From t+dfbsd at timdarby.net Thu Dec 1 08:27:40 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Thu, 1 Dec 2016 09:27:40 -0700 Subject: Notice to Golang users Message-ID: If you use the Go compiler with Dragonfly (it's in dports), be aware that the upcoming release of Go 1.8 will require DF 4.4.4 or later: https://beta.golang.org/doc/go1.8 The Go team had put a workaround in place for DF for a thread signal handling bug. This was recently fixed by Matt, so they've removed the workaround in Go 1.8. If you are using a version of DF prior to 4.4.4, you can use Go 1.7 or earlier. Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From venture37 at geeklan.co.uk Sun Dec 4 07:56:48 2016 From: venture37 at geeklan.co.uk (Sevan Janiyan) Date: Sun, 4 Dec 2016 15:56:48 +0000 Subject: Fwd: BSD Devroom CFP @ Fosdem'17 – call for participation extended Message-ID: Hi, The BSD devroom has moved the submission deadline to Dec 10th. There's still time to submit a talk! Regards, - rodrigo On behalf of the BSD devroom Fosdem 2017 BSD devroom Call for Participation ============================================== Important dates -------------------- * Conference date : 4 & 5 February 2017 in Brussels, Belgium * Devroom date : Sunday 5 February 2017 * Submission deadline : Sunday 27 November 2016 * Speaker notification : Sunday 4 December 2016 The topic of the devroom includes all BSD operating systems, and every talk is welcome, from hacker discussions to real-world examples and presentations about new and shiny features. Practical -------------------- * The default duration for talks will be 45 minutes including discussion. Feel free to ask if you want a longer or a shorter slot. * Presentations can be recorded and streamed; sending your proposal implies giving permission to be recorded. However, exceptions can be made for exceptional circumstances. To submit your proposal, visit : https://penta.fosdem.org/submission/FOSDEM17 Account creation ---------------------- If you already have a Pentabarf account, please *don't* create a new one. If you forgot your password, reset it. If not, follow the instructions to create an account. Submit a talk ---------------------- Create an "event" and click on "Show all" in the top right corner to display the full form. Your submission must include the following information: * The title and subtitle of your talk (please be descriptive, as titles will be listed alongside ~500 others from other projects) * Select "BSD devroom" as the track. * A short abstract of one paragraph * A longer description if you wish to do so * Links to related websites/blogs etc. - Rodrigo Osorio On behalf of the BSD Devroom From ipc at peercorpstrust.org Sat Dec 10 03:19:14 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 13:19:14 +0200 Subject: Parallel compression not using all available CPU Message-ID: Hi, I've observed that parallel compression tools such as pixz and lbzip2 do not make use of all of the available CPU under Dragonfly. On other OSes, they do. When testing on a 50 GB file, using top I've observed that CPU idle percentages consistently hover around the 90% range for pixz and ~70% for lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until compression is complete. Correspondingly, compression takes significantly longer under Dragonfly, so the CPU is really being underutilized in this case as opposed to erroneous reporting by top.
This was tested on two systems, one 16c/32t and a 2c/2t system on a recent master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET 2016. Has anyone else possibly observed this? -- Mike From justin at shiningsilence.com Sat Dec 10 12:26:32 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Sat, 10 Dec 2016 15:26:32 -0500 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: On the two DragonFly systems, was it Hammer or UFS? I would be surprised if that made a difference, but it might? On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund wrote: > Hi, > > I've observed that parallel compression tools such as pixz and lbzip2 do not > make use of all of the available CPU under Dragonfly. On other OSes, it > does. > > When testing on a 50 gb file, using top I've observed that CPU idle > percentages consistently hover around the 90% range for pixz and ~70% for > lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until > compression is complete. Correspondingly, compression takes significantly > longer under Dragonfly, so the CPU is really being under utilized in this > case as opposed to erroneous reporting by top. > > This was tested on two systems, one 16c/32t and a 2c/2t system on a recent > master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET > 2016. > > Has anyone else possibly observed this? > > -- > Mike > From ipc at peercorpstrust.org Sat Dec 10 13:14:26 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 23:14:26 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Hi, On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. On 12/10/2016 10:26 PM, Justin Sherrill wrote: > On the two DragonFly systems, was it Hammer or UFS? I would be > surprised if that made a difference, but it might? > > On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund > wrote: >> Hi, >> >> I've observed that parallel compression tools such as pixz and lbzip2 do not >> make use of all of the available CPU under Dragonfly. On other OSes, it >> does. >> >> When testing on a 50 gb file, using top I've observed that CPU idle >> percentages consistently hover around the 90% range for pixz and ~70% for >> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >> compression is complete. Correspondingly, compression takes significantly >> longer under Dragonfly, so the CPU is really being under utilized in this >> case as opposed to erroneous reporting by top. >> >> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >> 2016. >> >> Has anyone else possibly observed this? >> >> -- >> Mike >> > From dillon at apollo.backplane.com Sat Dec 10 13:29:24 2016 From: dillon at apollo.backplane.com (Matthew Dillon) Date: Sat, 10 Dec 2016 13:29:24 -0800 (PST) Subject: Parallel compression not using all available CPU References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <20161210212924.CF3EE18CB714C2@apollo.backplane.com> :Hi, : :On both systems HAMMER was used. 
One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. Generally speaking I wouldn't want these utilities to use all available cpu threads by default. It's not usually a good idea for a program to just assume that all the cpus are there for it (by default). But they should have command line options that allow the number of threads to be specified. -Matt Matthew Dillon From jasse at yberwaffe.com Sat Dec 10 16:00:02 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Sun, 11 Dec 2016 01:00:02 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Have you tried to disable hypertreads in the BIOS ??? It's a long shot, I know, but it might help. On 2016-12-10 22:14, PeerCorps Trust Fund wrote: > Hi, > > On both systems HAMMER was used. One small correction concerning the > 2c/2t machine, both compression programs did effectively utilize that > CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t > where the CPU isn't effectively maxed out. I'll continue to try and > investigate why and report back if I find anything. > > > On 12/10/2016 10:26 PM, Justin Sherrill wrote: >> On the two DragonFly systems, was it Hammer or UFS? I would be >> surprised if that made a difference, but it might? >> >> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >> wrote: >>> Hi, >>> >>> I've observed that parallel compression tools such as pixz and >>> lbzip2 do not >>> make use of all of the available CPU under Dragonfly. On other OSes, it >>> does. >>> >>> When testing on a 50 gb file, using top I've observed that CPU idle >>> percentages consistently hover around the 90% range for pixz and >>> ~70% for >>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>> idle until >>> compression is complete. Correspondingly, compression takes >>> significantly >>> longer under Dragonfly, so the CPU is really being under utilized in >>> this >>> case as opposed to erroneous reporting by top. >>> >>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>> recent >>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>> 11:44:04 EET >>> 2016. >>> >>> Has anyone else possibly observed this? >>> >>> -- >>> Mike >>> >> > From ipc at peercorpstrust.org Sun Dec 11 13:22:44 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sun, 11 Dec 2016 23:22:44 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Message-ID: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Hi, It turns out that it was a combination of two things - turning off hyperthreading in BIOS and using a faster disk. I found a post from the author of lbzip2 which seems to describe what might be happening in this case, but reference was made to a user using an i5 mobile CPU: ######################################################################## "bzip2 author here. 
I strongly suspect that you see what you see because your Intel core i5 is probably only dual core PLUS hyper-threaded, not real quad-core. Meaning, you have two instances of the L2 per-core cache, not four, and each two hyperthreads share an L2 cache. Since the bzip2 compression/decompression is very cache sensitive (see "man bzip2"), the scaling factor will be determined mostly by how many OS-threads can dispose over a dedicated cache each. In your case this number is probably 2. Since you run two threads per core, those contend for the shared L2 cache, basically each messing with the other (flushing / invalidating the shared cache for the other). This contention shows up as double CPU time, because "waiting for cache" (or "waiting for main memory") is accounted for as CPU time. Hyperthreading is not useful but detrimental for lbzip2; so you should export LBZIP2="-n 2". You should not run more worker threads per core than: core-dedicated-cache-size divided by 8MB." ######################################################################## Running the compression again on the same file from an SSD with hyperthreading turned off, I was able to fully saturate all of the cores using lbzip2. None of this seemed obvious at first, but it rectified the situation. The biggest difference came from turning off hyperthreading (idle CPU - 20% vs the previous 90%) and then running from an SSD with hyperthreading turned off (idle CPU = 0%). Previously, the compression was run from a single HDD, not an SSD. Concerning the compression test using the same HDD under FreeBSD, well I don't know why it was able to saturate the CPU. Perhaps it has something to do with ZFS's aggressive caching. Turning that off and re-running the test would likely answer the question. Pixz performed similarly when the above two modifications were made. On 12/11/2016 02:00 AM, Jasse Jansson wrote: > Have you tried to disable hypertreads in the BIOS ??? > It's a long shot, I know, but it might help. > > On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> Hi, >> >> On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. >> >> >> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> On the two DragonFly systems, was it Hammer or UFS? I would be >>> surprised if that made a difference, but it might? >>> >>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>> wrote: >>>> Hi, >>>> >>>> I've observed that parallel compression tools such as pixz and lbzip2 do not >>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>> does. >>>> >>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>> percentages consistently hover around the 90% range for pixz and ~70% for >>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >>>> compression is complete. Correspondingly, compression takes significantly >>>> longer under Dragonfly, so the CPU is really being under utilized in this >>>> case as opposed to erroneous reporting by top. >>>> >>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >>>> 2016. >>>> >>>> Has anyone else possibly observed this? 
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
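A quick way to run that "test with 1, 2, and 4 threads" suggestion is a small timing loop. This is only a sketch, not something from the thread itself: the file name is a placeholder, lbzip2's -n flag matches the LBZIP2="-n 2" hint quoted earlier, and pixz's -p flag is assumed to be the rough equivalent.

    # compare wall-clock time for the same input at several thread counts
    for n in 1 2 4 8; do
        echo "== lbzip2 -n $n =="
        time lbzip2 -n $n -c bigfile.img > /dev/null
        # pixz is assumed to take a similar option, e.g.:
        # time pixz -p $n < bigfile.img > /dev/null
    done

Whichever thread count gives the lowest wall-clock time without starving the rest of the machine is the one worth putting in LBZIP2 (or the equivalent environment variable) permanently.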
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-Haswell machines Message-ID: A memory leak has been fixed in master's DRM (GPU) code which affects post-Haswell Intel GPUs, i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell-based Intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work, but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on Broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-DPI (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and on FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too, but no guarantees on stability. Radeon support has also seen some improvement, though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of RAM if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support, which is supposed to eat much less memory, but not yet. Chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed...
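For readers wondering what such an isolation account can look like in practice, one possible shape is sketched below. The account name, the xhost grant and the use of sudo are illustrative assumptions rather than the handbook's exact recipe, so follow the handbook page linked above for the supported procedure.

    # create a dedicated, unprivileged account just for the browser
    pw useradd browser1 -m -s /bin/sh
    # let that local user draw on the running X server
    xhost +SI:localuser:browser1
    # start chrome as the isolation user (sudo is in dports; DISPLAY and
    # xauth handling may need adjustment depending on the setup)
    sudo -u browser1 chrome --force-device-scale-factor=1.5

The point of the separate account is simply that a compromised browser process ends up with that account's (empty) privileges and files rather than your workstation account's.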
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's a quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS host +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've still read lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicated the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? 
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
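To turn that advice into something repeatable, a small timing loop is enough to find the sweet spot. The sketch below is illustrative only: it assumes lbzip2's -n worker-thread flag (the same knob the thread sets via LBZIP2="-n 2") and pixz's -p flag, and it writes the compressed stream to /dev/null so repeated runs don't clobber each other:

    # Compare wall-clock time at a few fixed thread counts.
    for n in 1 2 4 8; do
        echo "lbzip2 with $n worker threads:"
        time lbzip2 -n $n < bigfile > /dev/null
    done

    # Rough pixz equivalent, assuming its -p cpu-count option:
    # time pixz -p 4 < bigfile > /dev/null

Whichever count stops improving wall-clock time is the one worth exporting in the LBZIP2 environment variable.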
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-Haswell Intel GPUs, i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell-based Intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work, but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on Broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should give users a pretty good experience. There are some sites which still implode on it (and on FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too, but no guarantees on stability. Radeon support has also seen some improvement, though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support, which is supposed to eat much less memory, but not yet. Chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
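For readers who want a feel for the isolation-account idea before reading the handbook page above, here is a minimal sketch. The account name and the pw/xhost/su invocations are illustrative assumptions rather than the handbook's exact recipe; the linked RunSecureBrowser page is the authoritative procedure:

    # Sketch only -- see the handbook page for the supported steps.
    # As root: create a dedicated, unprivileged account to own the browser state.
    pw useradd browser -m -s /bin/sh

    # From your normal desktop session: allow that local user to talk to your X server.
    xhost +si:localuser:browser

    # Start chrome as that user, keeping DISPLAY from the environment and using the
    # 4K scaling hint mentioned above.
    su -m browser -c 'chrome --force-device-scale-factor=1.5'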
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's a quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS host +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've still read lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicated the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? 
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL:
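As a concrete illustration of the thread-count options discussed above, both tools from this thread accept an explicit worker count on the command line. A minimal sketch, assuming the -n (lbzip2) and -p (pixz) options of the versions in dports, with bigfile standing in for the 50 gb test file; check the man pages for the exact flags in your build:

    lbzip2 -n 4 bigfile        # compress with exactly 4 worker threads
    pixz -p 4 bigfile          # same idea for pixz
    export LBZIP2="-n 4"       # per-session default, as the lbzip2 author suggests above

Testing a few values (1, 2, 4, ...) as Matt suggests is the simplest way to find the sweet spot for a given machine and disk.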
From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-haswell intel GPUs. i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell based intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and in FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too but no guarantees on stability. Radeon support has also seen some improvement though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support which is supposed to eat much less memory, but not yet. chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed...
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that have the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL:
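Since fileX does show up under /pfs/slave, it is only the null-mounted view that needs refreshing. A minimal sketch of the unmount/re-mount cycle Tim describes, reusing the paths from the original question:

    umount /home/slave                   # drop the stale null-mounted view
    mount_null /pfs/slave /home/slave    # re-mount; fileX is now visible in /home/slave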
From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to tackle the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS hosts +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've already read a lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicate the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to tackle the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to tackle the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can use 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against, for instance, an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check this out for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to back up your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
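As a rough illustration of the thread-count testing suggested above: a minimal sketch that times a few explicit worker counts instead of the autodetected default (lbzip2 takes -n and pixz takes -p for the thread count; the test file path here is hypothetical):

    # compare wall-clock time at fixed worker counts
    for n in 1 2 4 8; do
        echo "== lbzip2 -n $n =="
        time lbzip2 -n $n -c /tmp/test.50g > /dev/null
    done
    # pixz equivalent, keeping the input file
    time pixz -p 4 -k /tmp/test.50g /tmp/test.50g.xz

Whichever count gives the shortest time on a given box is the one worth exporting for day-to-day use (e.g. LBZIP2="-n 4", as in the lbzip2 author's note quoted earlier).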
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-haswell intel GPUs. i.e. Broadwell, Skylake, etc. This leak would cause a OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell based intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and in FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Karby support in there too but no guarantees on stability. Radeon support has also seen some improvement though not at the same pace as Intel GPU support has seen. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support which is supposed to eat much less memory, but not yet. chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
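A minimal sketch of the isolation-account idea combined with the scale-factor flag mentioned above (the browser1 account name and the ssh-based X forwarding are assumptions for illustration, not necessarily the exact recipe on the handbook page):

    # as root, once: a throwaway account dedicated to the browser
    pw useradd browser1 -m -s /bin/sh
    # from the normal X session: run chrome as that account through a local trusted X tunnel
    ssh -Y browser1@localhost 'chrome --force-device-scale-factor=1.5'

The point is simply that the browser never runs with the credentials of the main workstation account; see the handbook link above for the recommended setup.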
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's a quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS host +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've still read lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicated the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? 
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
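To make the memoryuse limit above concrete, a small sketch (assuming ulimit -m takes 1024-byte blocks in /bin/sh, and using a browser as the example process):

    sysctl vm.pageout_memuse_mode          # 0 = off, 1 = passive (default), 2 = active pageout
    ( ulimit -m 512000; exec chrome ) &    # cap this instance's RSS at roughly 500MB

The VSZ of the process can still grow past the limit; only the resident set is forced back down, as described above.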
URL: From t+dfbsd at timdarby.net Thu Dec 1 08:27:40 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Thu, 1 Dec 2016 09:27:40 -0700 Subject: Notice to Golang users Message-ID: If you use the Go compiler with Dragonfly (it's in dports), be aware that the upcoming release of Go 1.8 will require DF 4.4.4 or later: https://beta.golang.org/doc/go1.8 ?The Go team had put a workaround in place for DF for a thread signal handling bug. This was recently fixed by Matt, so they've removed the workaround in Go 1.8. If you are using a version of DF prior to 4.4.4, you can use Go 1.7 or earlier. Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From venture37 at geeklan.co.uk Sun Dec 4 07:56:48 2016 From: venture37 at geeklan.co.uk (Sevan Janiyan) Date: Sun, 4 Dec 2016 15:56:48 +0000 Subject: =?UTF-8?Q?Fwd:_BSD_Devroom_CFP_@_Frosdem'17_=e2=80=93_call_for_part?= =?UTF-8?Q?icipation_extended?= Message-ID: Hi, The BSD devroom move the submission deadline to Dec 10th. There's still time to submit a talk ! Regards, - rodrigo On behalf of the BSD devroom Fosdem 2017 BSD devroom Call for Participation ============================================== Important dates -------------------- * Conference date : 4 & 5 February 2017 in Brussels, Belgium * Devroom date : Sunday 5 February 2016 * Submission deadline : Sunday 27 November 2016 * Speaker notified : Sunday 4 December 2016 The topic of the devroom includes all BSD operating systems and every talk is welcome from hacker discussions to real-word examples and presentations about new and shiny features. Practical -------------------- * The default duration for talks will be 45 minutes including discussion. Feel free to ask if you want to have a longer or a shorter slot. * Presentations can be recorded and streamed, sending your proposal implies giving permission to be recorded. However, exceptions can be made for exceptional circumstances. To submit your proposal, visit : https://penta.fosdem.org/submission/FOSDEM17 Account creation ---------------------- If you already have a Pentabarf account, please *don't* recreate a new one. If you forgot your password, reset it. If not, follow the instructions to create an account. Submit a talk ---------------------- Create an ?event? and click on "Show all" in the top right corner to display the full form. Your submission must include the following information * The title and subtitle of your talk (please be descriptive, as titles will be listed with ~500 from other projects) * select ?BSD devroom? as the track. * A short abstract of one paragraph * A longer description if you wish to do so * Links to related websites/blogs etc. - Rodrigo Osorio On behalf of the BSD Devroom From ipc at peercorpstrust.org Sat Dec 10 03:19:14 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 13:19:14 +0200 Subject: Parallel compression not using all available CPU Message-ID: Hi, I've observed that parallel compression tools such as pixz and lbzip2 do not make use of all of the available CPU under Dragonfly. On other OSes, it does. When testing on a 50 gb file, using top I've observed that CPU idle percentages consistently hover around the 90% range for pixz and ~70% for lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until compression is complete. Correspondingly, compression takes significantly longer under Dragonfly, so the CPU is really being under utilized in this case as opposed to erroneous reporting by top. 
This was tested on two systems, one 16c/32t and a 2c/2t system on a recent master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET 2016. Has anyone else possibly observed this? -- Mike From justin at shiningsilence.com Sat Dec 10 12:26:32 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Sat, 10 Dec 2016 15:26:32 -0500 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: On the two DragonFly systems, was it Hammer or UFS? I would be surprised if that made a difference, but it might? On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund wrote: > Hi, > > I've observed that parallel compression tools such as pixz and lbzip2 do not > make use of all of the available CPU under Dragonfly. On other OSes, it > does. > > When testing on a 50 gb file, using top I've observed that CPU idle > percentages consistently hover around the 90% range for pixz and ~70% for > lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until > compression is complete. Correspondingly, compression takes significantly > longer under Dragonfly, so the CPU is really being under utilized in this > case as opposed to erroneous reporting by top. > > This was tested on two systems, one 16c/32t and a 2c/2t system on a recent > master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET > 2016. > > Has anyone else possibly observed this? > > -- > Mike > From ipc at peercorpstrust.org Sat Dec 10 13:14:26 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 23:14:26 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Hi, On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. On 12/10/2016 10:26 PM, Justin Sherrill wrote: > On the two DragonFly systems, was it Hammer or UFS? I would be > surprised if that made a difference, but it might? > > On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund > wrote: >> Hi, >> >> I've observed that parallel compression tools such as pixz and lbzip2 do not >> make use of all of the available CPU under Dragonfly. On other OSes, it >> does. >> >> When testing on a 50 gb file, using top I've observed that CPU idle >> percentages consistently hover around the 90% range for pixz and ~70% for >> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >> compression is complete. Correspondingly, compression takes significantly >> longer under Dragonfly, so the CPU is really being under utilized in this >> case as opposed to erroneous reporting by top. >> >> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >> 2016. >> >> Has anyone else possibly observed this? >> >> -- >> Mike >> > From dillon at apollo.backplane.com Sat Dec 10 13:29:24 2016 From: dillon at apollo.backplane.com (Matthew Dillon) Date: Sat, 10 Dec 2016 13:29:24 -0800 (PST) Subject: Parallel compression not using all available CPU References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <20161210212924.CF3EE18CB714C2@apollo.backplane.com> :Hi, : :On both systems HAMMER was used. 
One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. Generally speaking I wouldn't want these utilities to use all available cpu threads by default. It's not usually a good idea for a program to just assume that all the cpus are there for it (by default). But they should have command line options that allow the number of threads to be specified. -Matt Matthew Dillon From jasse at yberwaffe.com Sat Dec 10 16:00:02 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Sun, 11 Dec 2016 01:00:02 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Have you tried to disable hypertreads in the BIOS ??? It's a long shot, I know, but it might help. On 2016-12-10 22:14, PeerCorps Trust Fund wrote: > Hi, > > On both systems HAMMER was used. One small correction concerning the > 2c/2t machine, both compression programs did effectively utilize that > CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t > where the CPU isn't effectively maxed out. I'll continue to try and > investigate why and report back if I find anything. > > > On 12/10/2016 10:26 PM, Justin Sherrill wrote: >> On the two DragonFly systems, was it Hammer or UFS? I would be >> surprised if that made a difference, but it might? >> >> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >> wrote: >>> Hi, >>> >>> I've observed that parallel compression tools such as pixz and >>> lbzip2 do not >>> make use of all of the available CPU under Dragonfly. On other OSes, it >>> does. >>> >>> When testing on a 50 gb file, using top I've observed that CPU idle >>> percentages consistently hover around the 90% range for pixz and >>> ~70% for >>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>> idle until >>> compression is complete. Correspondingly, compression takes >>> significantly >>> longer under Dragonfly, so the CPU is really being under utilized in >>> this >>> case as opposed to erroneous reporting by top. >>> >>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>> recent >>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>> 11:44:04 EET >>> 2016. >>> >>> Has anyone else possibly observed this? >>> >>> -- >>> Mike >>> >> > From ipc at peercorpstrust.org Sun Dec 11 13:22:44 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sun, 11 Dec 2016 23:22:44 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Message-ID: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Hi, It turns out that it was a combination of two things - turning off hyperthreading in BIOS and using a faster disk. I found a post from the author of lbzip2 which seems to describe what might be happening in this case, but reference was made to a user using an i5 mobile CPU: ######################################################################## "bzip2 author here. 
I strongly suspect that you see what you see because your Intel core i5 is probably only dual core PLUS hyper-threaded, not real quad-core. Meaning, you have two instances of the L2 per-core cache, not four, and each two hyperthreads share an L2 cache. Since the bzip2 compression/decompression is very cache sensitive (see "man bzip2"), the scaling factor will be determined mostly by how many OS-threads can dispose over a dedicated cache each. In your case this number is probably 2. Since you run two threads per core, those contend for the shared L2 cache, basically each messing with the other (flushing / invalidating the shared cache for the other). This contention shows up as double CPU time, because "waiting for cache" (or "waiting for main memory") is accounted for as CPU time. Hyperthreading is not useful but detrimental for lbzip2; so you should export LBZIP2="-n 2". You should not run more worker threads per core than: core-dedicated-cache-size divided by 8MB." ######################################################################## Running the compression again on the same file from an SSD with hyperthreading turned off, I was able to fully saturate all of the cores using lbzip2. None of this seemed obvious at first, but it rectified the situation. The biggest difference came from turning off hyperthreading (idle CPU - 20% vs the previous 90%) and then running from an SSD with hyperthreading turned off (idle CPU = 0%). Previously, the compression was run from a single HDD, not an SSD. Concerning the compression test using the same HDD under FreeBSD, well I don't know why it was able to saturate the CPU. Perhaps it has something to do with ZFS's aggressive caching. Turning that off and re-running the test would likely answer the question. Pixz performed similarly when the above two modifications were made. On 12/11/2016 02:00 AM, Jasse Jansson wrote: > Have you tried to disable hypertreads in the BIOS ??? > It's a long shot, I know, but it might help. > > On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> Hi, >> >> On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. >> >> >> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> On the two DragonFly systems, was it Hammer or UFS? I would be >>> surprised if that made a difference, but it might? >>> >>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>> wrote: >>>> Hi, >>>> >>>> I've observed that parallel compression tools such as pixz and lbzip2 do not >>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>> does. >>>> >>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>> percentages consistently hover around the 90% range for pixz and ~70% for >>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >>>> compression is complete. Correspondingly, compression takes significantly >>>> longer under Dragonfly, so the CPU is really being under utilized in this >>>> case as opposed to erroneous reporting by top. >>>> >>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >>>> 2016. >>>> >>>> Has anyone else possibly observed this? 
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
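One way to act on the thread-count advice above is a quick timing loop; a minimal /bin/sh sketch follows (the input file name and thread counts are only examples, and pixz accepts a similar -p option):

    # time lbzip2 at several worker-thread counts on the same input file
    for n in 1 2 4; do
        echo "=== lbzip2 with $n worker threads ==="
        time lbzip2 -n $n -c /tmp/testfile > /dev/null
    done

Whichever count wins on a given machine can then be made the default with something like export LBZIP2="-n 2", as quoted earlier in the thread.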
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-Haswell machines Message-ID: A memory leak has been fixed in master's drm (GPU) code which affects post-Haswell Intel GPUs, i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell-based Intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work, but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on Broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-DPI (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should provide a pretty good experience for users. There are some sites which still implode on it (in FreeBSD too, so that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too, but no guarantees on stability. Radeon support has also seen some improvement, though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of RAM if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support, which is supposed to eat much less memory, but not yet; chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed...
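For readers who want the flavor of the isolation-account approach before reading the handbook page, a rough sketch is below; the account name and display number are made up for illustration, and the handbook's actual recipe should be treated as authoritative:

    # one-time setup (as root): create a dedicated, unprivileged browser account
    pw useradd browser -m -s /bin/sh

    # per session, from the X session you want the browser displayed on:
    # (su will prompt for the browser account's password unless run as root)
    xhost +si:localuser:browser
    su -l browser -c 'env DISPLAY=:0 chrome --force-device-scale-factor=1.5'

This keeps the browser's profile, cache and cookies owned by a throwaway account rather than by your main workstation account.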
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's quite easy to get DF running. But my concerns are mostly about HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM with a Celeron CPU at 1.6GHz. - This NAS hosts +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've already read a lot of the DF manuals and pages: http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and dedicate the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot"? Should it be RAID1 or RAID5, ...? Can it be done on 1 physical disk (thus 2 filesystems)? Should it be 2 PFS in 1 HammerFS located on 1 slice, or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk? (The disk is only connected to the NAS once per month.) Should I do a "dd"? Should I use the master-slave concepts? Should I use cpdup? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
Peeter

--
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From vincent.delft at gmail.com  Mon Dec 26 03:30:12 2016
From: vincent.delft at gmail.com (vincent delft)
Date: Mon, 26 Dec 2016 12:30:12 +0100
Subject: what is the best Hammer setup to taclke the bitrot problem + DRP
Message-ID: 

Hello,

First of all I'm new to DF.
Thanks for the nice install process. It's quite easy to have DF running.
But my concerns are more linked to HammerFS.

My context is the following:
- I would like to have a NAS system running on a small machine: 4GB RAM
  having a celeron CPU of 1.6GHz.
- This NAS hosts +- 700GB of data
- I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
  outside the NAS for DRP reasons.
- I'm looking for a solution tackling the bit-rot problem.

Concerning HammerFS, I've already read a lot of DF manuals, pages, ... :

http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
https://www.dragonflybsd.org/hammer/

So basically, I've understood that I must install DF on a small SSD disk and
will dedicate the 1TB disk to HammerFS.

But I have some questions:
- What must be the setup of HammerFS to solve "bit-rot"? Should it be Raid1
  or Raid5, ...? Can it be done on 1 physical disk (thus 2 filesystems)?
  Should it be 2 PFS in 1 HammerFS located on 1 slice, or should I create 2
  slices on the disk?
- For DRP reasons, how can I perform a backup on an external disk? (The disk
  is only connected 1x per month to the NAS.) Should I do a "dd"? Should I
  use the master-slave concepts? Should I use cpdup?

Many thanks for your replies.

Vincent
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kusumi.tomohiro at gmail.com  Mon Dec 26 04:15:08 2016
From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi)
Date: Mon, 26 Dec 2016 21:15:08 +0900
Subject: what is the best Hammer setup to taclke the bitrot problem + DRP
In-Reply-To: 
References: 
Message-ID: 

In short, hammer doesn't solve (or fix) what you call bit-rot.

2016-12-26 20:30 GMT+09:00 vincent delft :
> Hello,
>
> First of all I'm new to DF.
> Thanks for the nice install process. It's a quite easy to have DF running.
> But my concerns are more linked to HammerFS.
>
> My context is the following:
> - I would like to have a NAS system running on a small machine: 4GB RAM
> having a celeron CPU of 1.6GHz.
> - This NAS host +- 700GB of data
> - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
> outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem.
>
>
> Concerning HammerFS, I've still read lot of DF manuals, pages, ... :
>
> http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
> http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
> http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
> http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
> https://www.dragonflybsd.org/hammer/
>
> So basically, I've understood that I must install DF on a small SSD disk and
> will dedicated the 1TB disk to HammerFS.
>
> But I have some questions:
> - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1
> or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
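As a rough sketch of how the RSS enforcement described above can be used (the
browser binary name is only an example, and ulimit -m in /bin/sh takes the
limit in kilobytes as far as I recall, so double-check the units on your
system):

    # check or set the enforcement mode: 0 = off, 1 = passive (default), 2 = active pageout
    sysctl vm.pageout_memuse_mode
    sysctl vm.pageout_memuse_mode=1

    # start a browser with roughly a 500 megabyte memoryuse (RSS) limit
    ulimit -m 512000
    chrome &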
This was tested on two systems, one 16c/32t and a 2c/2t system on a recent master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET 2016. Has anyone else possibly observed this? -- Mike From justin at shiningsilence.com Sat Dec 10 12:26:32 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Sat, 10 Dec 2016 15:26:32 -0500 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: On the two DragonFly systems, was it Hammer or UFS? I would be surprised if that made a difference, but it might? On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund wrote: > Hi, > > I've observed that parallel compression tools such as pixz and lbzip2 do not > make use of all of the available CPU under Dragonfly. On other OSes, it > does. > > When testing on a 50 gb file, using top I've observed that CPU idle > percentages consistently hover around the 90% range for pixz and ~70% for > lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until > compression is complete. Correspondingly, compression takes significantly > longer under Dragonfly, so the CPU is really being under utilized in this > case as opposed to erroneous reporting by top. > > This was tested on two systems, one 16c/32t and a 2c/2t system on a recent > master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET > 2016. > > Has anyone else possibly observed this? > > -- > Mike > From ipc at peercorpstrust.org Sat Dec 10 13:14:26 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 23:14:26 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Hi, On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. On 12/10/2016 10:26 PM, Justin Sherrill wrote: > On the two DragonFly systems, was it Hammer or UFS? I would be > surprised if that made a difference, but it might? > > On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund > wrote: >> Hi, >> >> I've observed that parallel compression tools such as pixz and lbzip2 do not >> make use of all of the available CPU under Dragonfly. On other OSes, it >> does. >> >> When testing on a 50 gb file, using top I've observed that CPU idle >> percentages consistently hover around the 90% range for pixz and ~70% for >> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >> compression is complete. Correspondingly, compression takes significantly >> longer under Dragonfly, so the CPU is really being under utilized in this >> case as opposed to erroneous reporting by top. >> >> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >> 2016. >> >> Has anyone else possibly observed this? >> >> -- >> Mike >> > From dillon at apollo.backplane.com Sat Dec 10 13:29:24 2016 From: dillon at apollo.backplane.com (Matthew Dillon) Date: Sat, 10 Dec 2016 13:29:24 -0800 (PST) Subject: Parallel compression not using all available CPU References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <20161210212924.CF3EE18CB714C2@apollo.backplane.com> :Hi, : :On both systems HAMMER was used. 
One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. Generally speaking I wouldn't want these utilities to use all available cpu threads by default. It's not usually a good idea for a program to just assume that all the cpus are there for it (by default). But they should have command line options that allow the number of threads to be specified. -Matt Matthew Dillon From jasse at yberwaffe.com Sat Dec 10 16:00:02 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Sun, 11 Dec 2016 01:00:02 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Have you tried to disable hypertreads in the BIOS ??? It's a long shot, I know, but it might help. On 2016-12-10 22:14, PeerCorps Trust Fund wrote: > Hi, > > On both systems HAMMER was used. One small correction concerning the > 2c/2t machine, both compression programs did effectively utilize that > CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t > where the CPU isn't effectively maxed out. I'll continue to try and > investigate why and report back if I find anything. > > > On 12/10/2016 10:26 PM, Justin Sherrill wrote: >> On the two DragonFly systems, was it Hammer or UFS? I would be >> surprised if that made a difference, but it might? >> >> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >> wrote: >>> Hi, >>> >>> I've observed that parallel compression tools such as pixz and >>> lbzip2 do not >>> make use of all of the available CPU under Dragonfly. On other OSes, it >>> does. >>> >>> When testing on a 50 gb file, using top I've observed that CPU idle >>> percentages consistently hover around the 90% range for pixz and >>> ~70% for >>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>> idle until >>> compression is complete. Correspondingly, compression takes >>> significantly >>> longer under Dragonfly, so the CPU is really being under utilized in >>> this >>> case as opposed to erroneous reporting by top. >>> >>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>> recent >>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>> 11:44:04 EET >>> 2016. >>> >>> Has anyone else possibly observed this? >>> >>> -- >>> Mike >>> >> > From ipc at peercorpstrust.org Sun Dec 11 13:22:44 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sun, 11 Dec 2016 23:22:44 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Message-ID: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Hi, It turns out that it was a combination of two things - turning off hyperthreading in BIOS and using a faster disk. I found a post from the author of lbzip2 which seems to describe what might be happening in this case, but reference was made to a user using an i5 mobile CPU: ######################################################################## "bzip2 author here. 
I strongly suspect that you see what you see because your Intel core i5 is probably only dual core PLUS hyper-threaded, not real quad-core. Meaning, you have two instances of the L2 per-core cache, not four, and each two hyperthreads share an L2 cache. Since the bzip2 compression/decompression is very cache sensitive (see "man bzip2"), the scaling factor will be determined mostly by how many OS-threads can dispose over a dedicated cache each. In your case this number is probably 2. Since you run two threads per core, those contend for the shared L2 cache, basically each messing with the other (flushing / invalidating the shared cache for the other). This contention shows up as double CPU time, because "waiting for cache" (or "waiting for main memory") is accounted for as CPU time. Hyperthreading is not useful but detrimental for lbzip2; so you should export LBZIP2="-n 2". You should not run more worker threads per core than: core-dedicated-cache-size divided by 8MB." ######################################################################## Running the compression again on the same file from an SSD with hyperthreading turned off, I was able to fully saturate all of the cores using lbzip2. None of this seemed obvious at first, but it rectified the situation. The biggest difference came from turning off hyperthreading (idle CPU - 20% vs the previous 90%) and then running from an SSD with hyperthreading turned off (idle CPU = 0%). Previously, the compression was run from a single HDD, not an SSD. Concerning the compression test using the same HDD under FreeBSD, well I don't know why it was able to saturate the CPU. Perhaps it has something to do with ZFS's aggressive caching. Turning that off and re-running the test would likely answer the question. Pixz performed similarly when the above two modifications were made. On 12/11/2016 02:00 AM, Jasse Jansson wrote: > Have you tried to disable hypertreads in the BIOS ??? > It's a long shot, I know, but it might help. > > On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> Hi, >> >> On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. >> >> >> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> On the two DragonFly systems, was it Hammer or UFS? I would be >>> surprised if that made a difference, but it might? >>> >>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>> wrote: >>>> Hi, >>>> >>>> I've observed that parallel compression tools such as pixz and lbzip2 do not >>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>> does. >>>> >>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>> percentages consistently hover around the 90% range for pixz and ~70% for >>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >>>> compression is complete. Correspondingly, compression takes significantly >>>> longer under Dragonfly, so the CPU is really being under utilized in this >>>> case as opposed to erroneous reporting by top. >>>> >>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >>>> 2016. >>>> >>>> Has anyone else possibly observed this? 
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-haswell intel GPUs. i.e. Broadwell, Skylake, etc. This leak would cause a OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell based intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and in FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Karby support in there too but no guarantees on stability. Radeon support has also seen some improvement though not at the same pace as Intel GPU support has seen. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support which is supposed to eat much less memory, but not yet. chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter
--
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016
From: vincent.delft at gmail.com (vincent delft)
Date: Mon, 26 Dec 2016 12:30:12 +0100
Subject: what is the best Hammer setup to tackle the bitrot problem + DRP
Message-ID:

Hello,

First of all I'm new to DF.
Thanks for the nice install process. It's quite easy to have DF running.
But my concerns are more linked to HammerFS.

My context is the following:
- I would like to have a NAS system running on a small machine: 4GB RAM
with a 1.6GHz Celeron CPU.
- This NAS hosts +- 700GB of data
- I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
outside the NAS for DRP reasons.
- I'm looking for a solution tackling the bit-rot problem.

Concerning HammerFS, I've already read a lot of DF manuals, pages, ... :

http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
https://www.dragonflybsd.org/hammer/

So basically, I've understood that I must install DF on a small SSD disk
and dedicate the 1TB disk to HammerFS.

But I have some questions:
- What must be the setup of HammerFS to solve "bit-rot" ? Should it be
Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2
filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or
should I create 2 slices on the disk?

- For DRP reasons, how can I perform a backup on an external disk ? (the
disk is only connected 1x per month on the NAS). Should I do a "dd" ?
should I use the master-slave concepts ? Should I use cpdup ?

Many thanks for your replies.

Vincent
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016
From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi)
Date: Mon, 26 Dec 2016 21:15:08 +0900
Subject: what is the best Hammer setup to tackle the bitrot problem + DRP
In-Reply-To:
References:
Message-ID:

In short, hammer doesn't solve (or fix) what you call bit-rot.

2016-12-26 20:30 GMT+09:00 vincent delft :
> Hello,
>
> First of all I'm new to DF.
> Thanks for the nice install process. It's a quite easy to have DF running.
> But my concerns are more linked to HammerFS.
>
> My context is the following:
> - I would like to have a NAS system running on a small machine: 4GB RAM
> having a celeron CPU of 1.6GHz.
> - This NAS host +- 700GB of data
> - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
> outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem.
>
>
> Concerning HammerFS, I've still read lot of DF manuals, pages, ... :
>
> http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
> http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
> http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
> http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
> https://www.dragonflybsd.org/hammer/
>
> So basically, I've understood that I must install DF on a small SSD disk and
> will dedicated the 1TB disk to HammerFS.
>
> But I have some questions:
> - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1
> or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2
> slices on the disk?
>
> - For DRP reasons, how can I perform a backup on an external disk ? (the
> disk is only connected 1x per month on the NAS). Should I do a "dd" ? should
> I use the master-slave concepts ? Should I use cpdup ?
>
> Many thanks for your replies.
>
> Vincent
>
>

From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016
From: ipc at peercorpstrust.org (PeerCorps Trust Fund)
Date: Mon, 26 Dec 2016 15:06:24 +0200
Subject: what is the best Hammer setup to tackle the bitrot problem + DRP
In-Reply-To:
References:
Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org>

Hi Vincent,

Most users do go with the OS installed on an SSD (120-256 GB or more), and
depending on your storage needs, you can 'hammer mirror-stream' to mirror
a main storage drive to a second in the same machine as well as to a third
drive in a remote machine. Each of these disks can have a different
history retention policy.

The ability to access filesystem history gives you some safeguards
against, for instance, an accidental deletion or a rogue program that
decides to write garbage to a file (as happened to me recently). If one of
those disks should die, you have two other live full copies of all of your
data that you can access. No need to rebuild or resilver anything; just
promote the backup (slave) disk to a main (master) disk and move on.

HAMMER has checksums which will allow you to detect bitrot and you can use
a few different strategies to proactively check for it, but nothing is
going to try to "repair" a file as might be the case elsewhere (HAMMER 2
seems to address this though). What you can do instead when you encounter
some bad data is cd into your filesystem history (provided you had a
sufficiently long retention policy) and look for a good copy before it was
corrupted. This might also be on a different disk in another machine if
you made use of remote replication.

Siju George has written a nice commentary about how he made use of
DragonflyBSD and HAMMER in his company:

https://bsdmag.org/siju_george/

HAMMER doesn't do RAID on its own if that is what you are looking for. You
will need to get a real hardware RAID card. Areca makes cards that are
supported very well under Dragonfly. HAMMER's built-in filesystem
replication features have some advantages over traditional RAID depending
on your use case and needs.

There is also a nice guide on BSDNow about HAMMERfs which might give you
some additional ideas:

https://www.bsdnow.tv/tutorials/hammer

You can check out this for mirroring ideas as well:

https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/

In your case where a disk is only connected once per month, you can use
'hammer mirror-copy' to back up your data when it is needed. For something
more automatic 'hammer mirror-stream' should work also.

If any of the above is in error, I imagine someone with deeper insight
will chime in.

Mike

On 12/26/2016 01:30 PM, vincent delft wrote:
> Hello,
>
> First of all I'm new to DF.
> Thanks for the nice install process. It's a quite easy to have DF running.
> But my concerns are more linked to HammerFS.
>
> My context is the following:
> - I would like to have a NAS system running on a small machine: 4GB RAM
> having a celeron CPU of 1.6GHz.
> - This NAS host +- 700GB of data
> - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
> outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem.
>
>
> Concerning HammerFS, I've still read lot of DF manuals, pages, ... :
>
> http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
> http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
> http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
> http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
> https://www.dragonflybsd.org/hammer/
>
> So basically, I've understood that I must install DF on a small SSD disk
> and will dedicated the 1TB disk to HammerFS.
>
> But I have some questions:
> - What must be the setup of HammerFS to solve "bit-rot" ? Should it be
> Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2
> filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or
> should I create 2 slices on the disk?
>
> - For DRP reasons, how can I perform a backup on an external disk ? (the
> disk is only connected 1x per month on the NAS). Should I do a "dd" ?
> should I use the master-slave concepts ? Should I use cpdup ?
>
> Many thanks for your replies.
>
> Vincent
>

From dillon at backplane.com Wed Dec 28 14:32:55 2016
From: dillon at backplane.com (Matthew Dillon)
Date: Wed, 28 Dec 2016 14:32:55 -0800
Subject: Swap changes in master, enforcement of RLIMIT_RSS
Message-ID:

The maximum supported swap space has been significantly increased in
master. The amount of core memory required to manage swap is approximately
1:128 (ram:swap-used). The system allows up to half of your core ram to be
used for swap management so the maximum swap you can actually use on a
machine would be as follows:

With 1GB of ram it would be 512M*128 = 64GB of swap.
With 8GB of ram it would be 4G*128 = 512GB of swap.
With 128GB of ram it would be around ~8TB of swap.

The maximum the system supports due to KVM limitations is around ~32TB of
swap. While you can configure more swap than the above limitations, the
system might not be able to actually manage as much as you configure. I'm
considering improving the management code to reduce its memory
requirements.

Also, people shouldn't go overboard w/regards to configuring swap space.
Configured swap does eat some memory resources even when not actively
in-use, and there might not be a practical use for having such massive
amounts of swap with your setup.

--

The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced
on a per-process basis. By default this limit is infinity so nothing will
change for most people. Setting a memoryuse limit has advantages in
certain circumstances as long as it isn't over-used. For example, it might
be reasonable to start your browser with a 500m (500 megabyte) limit. The
system will allow each process VSZ to grow larger, but will force the RSS
to the limit and devalue the pages it removes to make them more likely to
be recycled by the system.

There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can
be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the
enforcement (system operates the same as it did before the commit). The
default value of 1 enables enforcement by removing pages from the process
pmap and deactivating them passively. The value 2 enables enforcement and
actively pages data out and frees pages. Mode 2 can cause excessive paging
and wear out your SSD so it is not currently recommended.

-Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
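As a rough illustration of the memoryuse limit and the pageout sysctl
described above, something along these lines should work from /bin/sh on a
recent master. The 500 MB cap and the browser command are only examples,
not taken from the announcement, and ulimit -m is assumed to take its
value in 1K blocks:

    # Start one program with its resident set capped at roughly 500 MB
    # (512000 1K blocks). The limit applies to the program started in the
    # subshell, not to the login shell itself.
    ( ulimit -m 512000; exec firefox ) &

    # Inspect the current enforcement mode (1 is the default: pages beyond
    # the limit are removed from the process pmap and passively deactivated).
    sysctl vm.pageout_memuse_mode

    # As root, disable enforcement entirely (the pre-change behavior) ...
    sysctl vm.pageout_memuse_mode=0

    # ... or switch to active pageout; as noted above this can cause
    # excessive paging and SSD wear, so it is not currently recommended.
    sysctl vm.pageout_memuse_mode=2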