From t+dfbsd at timdarby.net Thu Dec 1 08:27:40 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Thu, 1 Dec 2016 09:27:40 -0700 Subject: Notice to Golang users Message-ID: If you use the Go compiler with Dragonfly (it's in dports), be aware that the upcoming release of Go 1.8 will require DF 4.4.4 or later: https://beta.golang.org/doc/go1.8 The Go team had put a workaround in place for DF for a thread signal handling bug. This was recently fixed by Matt, so they've removed the workaround in Go 1.8. If you are using a version of DF prior to 4.4.4, you can use Go 1.7 or earlier. Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From venture37 at geeklan.co.uk Sun Dec 4 07:56:48 2016 From: venture37 at geeklan.co.uk (Sevan Janiyan) Date: Sun, 4 Dec 2016 15:56:48 +0000 Subject: Fwd: BSD Devroom CFP @ Fosdem'17 – call for participation extended Message-ID: Hi, The BSD devroom has moved the submission deadline to Dec 10th. There's still time to submit a talk! Regards, - rodrigo On behalf of the BSD devroom Fosdem 2017 BSD devroom Call for Participation ============================================== Important dates -------------------- * Conference date : 4 & 5 February 2017 in Brussels, Belgium * Devroom date : Sunday 5 February 2017 * Submission deadline : Sunday 27 November 2016 * Speaker notification : Sunday 4 December 2016 The topic of the devroom includes all BSD operating systems, and every talk is welcome, from hacker discussions to real-world examples and presentations about new and shiny features. Practical -------------------- * The default duration for talks will be 45 minutes including discussion. Feel free to ask if you want a longer or a shorter slot. * Presentations can be recorded and streamed; sending your proposal implies giving permission to be recorded. However, exceptions can be made for exceptional circumstances. To submit your proposal, visit : https://penta.fosdem.org/submission/FOSDEM17 Account creation ---------------------- If you already have a Pentabarf account, please *don't* create a new one. If you forgot your password, reset it. If not, follow the instructions to create an account. Submit a talk ---------------------- Create an "event" and click on "Show all" in the top right corner to display the full form. Your submission must include the following information: * The title and subtitle of your talk (please be descriptive, as titles will be listed alongside ~500 others from other projects) * Select "BSD devroom" as the track. * A short abstract of one paragraph * A longer description if you wish to do so * Links to related websites/blogs etc. - Rodrigo Osorio On behalf of the BSD Devroom From ipc at peercorpstrust.org Sat Dec 10 03:19:14 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 13:19:14 +0200 Subject: Parallel compression not using all available CPU Message-ID: Hi, I've observed that parallel compression tools such as pixz and lbzip2 do not make use of all of the available CPU under Dragonfly. On other OSes, they do. When testing on a 50 GB file, using top I've observed that CPU idle percentages consistently hover around the 90% range for pixz and ~70% for lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until compression is complete. Correspondingly, compression takes significantly longer under Dragonfly, so the CPU is really being underutilized in this case as opposed to erroneous reporting by top.
This was tested on two systems, one 16c/32t and a 2c/2t system on a recent master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET 2016. Has anyone else possibly observed this? -- Mike From justin at shiningsilence.com Sat Dec 10 12:26:32 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Sat, 10 Dec 2016 15:26:32 -0500 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: On the two DragonFly systems, was it Hammer or UFS? I would be surprised if that made a difference, but it might? On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund wrote: > Hi, > > I've observed that parallel compression tools such as pixz and lbzip2 do not > make use of all of the available CPU under Dragonfly. On other OSes, it > does. > > When testing on a 50 gb file, using top I've observed that CPU idle > percentages consistently hover around the 90% range for pixz and ~70% for > lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until > compression is complete. Correspondingly, compression takes significantly > longer under Dragonfly, so the CPU is really being under utilized in this > case as opposed to erroneous reporting by top. > > This was tested on two systems, one 16c/32t and a 2c/2t system on a recent > master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET > 2016. > > Has anyone else possibly observed this? > > -- > Mike > From ipc at peercorpstrust.org Sat Dec 10 13:14:26 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 23:14:26 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Hi, On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. On 12/10/2016 10:26 PM, Justin Sherrill wrote: > On the two DragonFly systems, was it Hammer or UFS? I would be > surprised if that made a difference, but it might? > > On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund > wrote: >> Hi, >> >> I've observed that parallel compression tools such as pixz and lbzip2 do not >> make use of all of the available CPU under Dragonfly. On other OSes, it >> does. >> >> When testing on a 50 gb file, using top I've observed that CPU idle >> percentages consistently hover around the 90% range for pixz and ~70% for >> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >> compression is complete. Correspondingly, compression takes significantly >> longer under Dragonfly, so the CPU is really being under utilized in this >> case as opposed to erroneous reporting by top. >> >> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >> 2016. >> >> Has anyone else possibly observed this? >> >> -- >> Mike >> > From dillon at apollo.backplane.com Sat Dec 10 13:29:24 2016 From: dillon at apollo.backplane.com (Matthew Dillon) Date: Sat, 10 Dec 2016 13:29:24 -0800 (PST) Subject: Parallel compression not using all available CPU References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <20161210212924.CF3EE18CB714C2@apollo.backplane.com> :Hi, : :On both systems HAMMER was used. 
One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. Generally speaking I wouldn't want these utilities to use all available cpu threads by default. It's not usually a good idea for a program to just assume that all the cpus are there for it (by default). But they should have command line options that allow the number of threads to be specified. -Matt Matthew Dillon From jasse at yberwaffe.com Sat Dec 10 16:00:02 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Sun, 11 Dec 2016 01:00:02 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Have you tried to disable hypertreads in the BIOS ??? It's a long shot, I know, but it might help. On 2016-12-10 22:14, PeerCorps Trust Fund wrote: > Hi, > > On both systems HAMMER was used. One small correction concerning the > 2c/2t machine, both compression programs did effectively utilize that > CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t > where the CPU isn't effectively maxed out. I'll continue to try and > investigate why and report back if I find anything. > > > On 12/10/2016 10:26 PM, Justin Sherrill wrote: >> On the two DragonFly systems, was it Hammer or UFS? I would be >> surprised if that made a difference, but it might? >> >> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >> wrote: >>> Hi, >>> >>> I've observed that parallel compression tools such as pixz and >>> lbzip2 do not >>> make use of all of the available CPU under Dragonfly. On other OSes, it >>> does. >>> >>> When testing on a 50 gb file, using top I've observed that CPU idle >>> percentages consistently hover around the 90% range for pixz and >>> ~70% for >>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>> idle until >>> compression is complete. Correspondingly, compression takes >>> significantly >>> longer under Dragonfly, so the CPU is really being under utilized in >>> this >>> case as opposed to erroneous reporting by top. >>> >>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>> recent >>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>> 11:44:04 EET >>> 2016. >>> >>> Has anyone else possibly observed this? >>> >>> -- >>> Mike >>> >> > From ipc at peercorpstrust.org Sun Dec 11 13:22:44 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sun, 11 Dec 2016 23:22:44 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Message-ID: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Hi, It turns out that it was a combination of two things - turning off hyperthreading in BIOS and using a faster disk. I found a post from the author of lbzip2 which seems to describe what might be happening in this case, but reference was made to a user using an i5 mobile CPU: ######################################################################## "bzip2 author here. 
I strongly suspect that you see what you see because your Intel core i5 is probably only dual core PLUS hyper-threaded, not real quad-core. Meaning, you have two instances of the L2 per-core cache, not four, and each two hyperthreads share an L2 cache. Since the bzip2 compression/decompression is very cache sensitive (see "man bzip2"), the scaling factor will be determined mostly by how many OS-threads can dispose over a dedicated cache each. In your case this number is probably 2. Since you run two threads per core, those contend for the shared L2 cache, basically each messing with the other (flushing / invalidating the shared cache for the other). This contention shows up as double CPU time, because "waiting for cache" (or "waiting for main memory") is accounted for as CPU time. Hyperthreading is not useful but detrimental for lbzip2; so you should export LBZIP2="-n 2". You should not run more worker threads per core than: core-dedicated-cache-size divided by 8MB." ######################################################################## Running the compression again on the same file from an SSD with hyperthreading turned off, I was able to fully saturate all of the cores using lbzip2. None of this seemed obvious at first, but it rectified the situation. The biggest difference came from turning off hyperthreading (idle CPU - 20% vs the previous 90%) and then running from an SSD with hyperthreading turned off (idle CPU = 0%). Previously, the compression was run from a single HDD, not an SSD. Concerning the compression test using the same HDD under FreeBSD, well I don't know why it was able to saturate the CPU. Perhaps it has something to do with ZFS's aggressive caching. Turning that off and re-running the test would likely answer the question. Pixz performed similarly when the above two modifications were made. On 12/11/2016 02:00 AM, Jasse Jansson wrote: > Have you tried to disable hypertreads in the BIOS ??? > It's a long shot, I know, but it might help. > > On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> Hi, >> >> On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. >> >> >> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> On the two DragonFly systems, was it Hammer or UFS? I would be >>> surprised if that made a difference, but it might? >>> >>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>> wrote: >>>> Hi, >>>> >>>> I've observed that parallel compression tools such as pixz and lbzip2 do not >>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>> does. >>>> >>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>> percentages consistently hover around the 90% range for pixz and ~70% for >>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >>>> compression is complete. Correspondingly, compression takes significantly >>>> longer under Dragonfly, so the CPU is really being under utilized in this >>>> case as opposed to erroneous reporting by top. >>>> >>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >>>> 2016. >>>> >>>> Has anyone else possibly observed this? 
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
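A quick way to run that "test with 1, 2, and 4 threads" suggestion is a small timing loop. This is only a sketch, not something from the thread itself: the file name is a placeholder, lbzip2's -n flag matches the LBZIP2="-n 2" hint quoted earlier, and pixz's -p flag is assumed to be the rough equivalent.

    # compare wall-clock time for the same input at several thread counts
    for n in 1 2 4 8; do
        echo "== lbzip2 -n $n =="
        time lbzip2 -n $n -c bigfile.img > /dev/null
        # pixz is assumed to take a similar option, e.g.:
        # time pixz -p $n < bigfile.img > /dev/null
    done

Whichever thread count gives the lowest wall-clock time without starving the rest of the machine is the one worth putting in LBZIP2 (or the equivalent environment variable) permanently.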
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-Haswell machines Message-ID: A memory leak has been fixed in master's DRM (GPU) code which affects post-Haswell Intel GPUs, i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell-based Intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work, but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on Broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-DPI (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and on FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too, but no guarantees on stability. Radeon support has also seen some improvement, though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of RAM if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support, which is supposed to eat much less memory, but not yet. Chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed...
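For readers wondering what such an isolation account can look like in practice, one possible shape is sketched below. The account name, the xhost grant and the use of sudo are illustrative assumptions rather than the handbook's exact recipe, so follow the handbook page linked above for the supported procedure.

    # create a dedicated, unprivileged account just for the browser
    pw useradd browser1 -m -s /bin/sh
    # let that local user draw on the running X server
    xhost +SI:localuser:browser1
    # start chrome as the isolation user (sudo is in dports; DISPLAY and
    # xauth handling may need adjustment depending on the setup)
    sudo -u browser1 chrome --force-device-scale-factor=1.5

The point of the separate account is simply that a compromised browser process ends up with that account's (empty) privileges and files rather than your workstation account's.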
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's a quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS host +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've still read lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicated the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? 
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
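To turn that advice into something repeatable, a small timing loop is enough to find the sweet spot. The sketch below is illustrative only: it assumes lbzip2's -n worker-thread flag (the same knob the thread sets via LBZIP2="-n 2") and pixz's -p flag, and it writes the compressed stream to /dev/null so repeated runs don't clobber each other:

    # Compare wall-clock time at a few fixed thread counts.
    for n in 1 2 4 8; do
        echo "lbzip2 with $n worker threads:"
        time lbzip2 -n $n < bigfile > /dev/null
    done

    # Rough pixz equivalent, assuming its -p cpu-count option:
    # time pixz -p 4 < bigfile > /dev/null

Whichever count stops improving wall-clock time is the one worth exporting in the LBZIP2 environment variable.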
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-Haswell Intel GPUs, i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell-based Intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work, but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on Broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should give users a pretty good experience. There are some sites which still implode on it (and on FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too, but no guarantees on stability. Radeon support has also seen some improvement, though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support, which is supposed to eat much less memory, but not yet. Chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
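For readers who want a feel for the isolation-account idea before reading the handbook page above, here is a minimal sketch. The account name and the pw/xhost/su invocations are illustrative assumptions rather than the handbook's exact recipe; the linked RunSecureBrowser page is the authoritative procedure:

    # Sketch only -- see the handbook page for the supported steps.
    # As root: create a dedicated, unprivileged account to own the browser state.
    pw useradd browser -m -s /bin/sh

    # From your normal desktop session: allow that local user to talk to your X server.
    xhost +si:localuser:browser

    # Start chrome as that user, keeping DISPLAY from the environment and using the
    # 4K scaling hint mentioned above.
    su -m browser -c 'chrome --force-device-scale-factor=1.5'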
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's a quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS host +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've still read lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicated the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? 
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL:
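As a concrete illustration of the thread-count options discussed above, both tools from this thread accept an explicit worker count on the command line. A minimal sketch, assuming the -n (lbzip2) and -p (pixz) options of the versions in dports, with bigfile standing in for the 50 gb test file; check the man pages for the exact flags in your build:

    lbzip2 -n 4 bigfile        # compress with exactly 4 worker threads
    pixz -p 4 bigfile          # same idea for pixz
    export LBZIP2="-n 4"       # per-session default, as the lbzip2 author suggests above

Testing a few values (1, 2, 4, ...) as Matt suggests is the simplest way to find the sweet spot for a given machine and disk.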
From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-haswell intel GPUs. i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell based intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and in FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too but no guarantees on stability. Radeon support has also seen some improvement though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support which is supposed to eat much less memory, but not yet. chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed...
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that have the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL:
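Since fileX does show up under /pfs/slave, it is only the null-mounted view that needs refreshing. A minimal sketch of the unmount/re-mount cycle Tim describes, reusing the paths from the original question:

    umount /home/slave                   # drop the stale null-mounted view
    mount_null /pfs/slave /home/slave    # re-mount; fileX is now visible in /home/slave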
From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to tackle the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS hosts +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've already read a lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicate the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to tackle the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to tackle the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can use 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against, for instance, an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check this out for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to back up your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
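As a rough illustration of the thread-count testing suggested above: a minimal sketch that times a few explicit worker counts instead of the autodetected default (lbzip2 takes -n and pixz takes -p for the thread count; the test file path here is hypothetical):

    # compare wall-clock time at fixed worker counts
    for n in 1 2 4 8; do
        echo "== lbzip2 -n $n =="
        time lbzip2 -n $n -c /tmp/test.50g > /dev/null
    done
    # pixz equivalent, keeping the input file
    time pixz -p 4 -k /tmp/test.50g /tmp/test.50g.xz

Whichever count gives the shortest time on a given box is the one worth exporting for day-to-day use (e.g. LBZIP2="-n 4", as in the lbzip2 author's note quoted earlier).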
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-haswell intel GPUs. i.e. Broadwell, Skylake, etc. This leak would cause a OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell based intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and in FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Karby support in there too but no guarantees on stability. Radeon support has also seen some improvement though not at the same pace as Intel GPU support has seen. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support which is supposed to eat much less memory, but not yet. chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
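A minimal sketch of the isolation-account idea combined with the scale-factor flag mentioned above (the browser1 account name and the ssh-based X forwarding are assumptions for illustration, not necessarily the exact recipe on the handbook page):

    # as root, once: a throwaway account dedicated to the browser
    pw useradd browser1 -m -s /bin/sh
    # from the normal X session: run chrome as that account through a local trusted X tunnel
    ssh -Y browser1@localhost 'chrome --force-device-scale-factor=1.5'

The point is simply that the browser never runs with the credentials of the main workstation account; see the handbook link above for the recommended setup.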
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's a quite easy to have DF running. But my concerns are more linked to HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM having a celeron CPU of 1.6GHz. - This NAS host +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've still read lot of DF manuals, pages, ... : http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and will dedicated the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk ? (the disk is only connected 1x per month on the NAS). Should I do a "dd" ? should I use the master-slave concepts ? Should I use cpdup ? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ? 
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
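To make the memoryuse limit above concrete, a small sketch (assuming ulimit -m takes 1024-byte blocks in /bin/sh, and using a browser as the example process):

    sysctl vm.pageout_memuse_mode          # 0 = off, 1 = passive (default), 2 = active pageout
    ( ulimit -m 512000; exec chrome ) &    # cap this instance's RSS at roughly 500MB

The VSZ of the process can still grow past the limit; only the resident set is forced back down, as described above.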
URL: From t+dfbsd at timdarby.net Thu Dec 1 08:27:40 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Thu, 1 Dec 2016 09:27:40 -0700 Subject: Notice to Golang users Message-ID: If you use the Go compiler with Dragonfly (it's in dports), be aware that the upcoming release of Go 1.8 will require DF 4.4.4 or later: https://beta.golang.org/doc/go1.8 ?The Go team had put a workaround in place for DF for a thread signal handling bug. This was recently fixed by Matt, so they've removed the workaround in Go 1.8. If you are using a version of DF prior to 4.4.4, you can use Go 1.7 or earlier. Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From venture37 at geeklan.co.uk Sun Dec 4 07:56:48 2016 From: venture37 at geeklan.co.uk (Sevan Janiyan) Date: Sun, 4 Dec 2016 15:56:48 +0000 Subject: =?UTF-8?Q?Fwd:_BSD_Devroom_CFP_@_Frosdem'17_=e2=80=93_call_for_part?= =?UTF-8?Q?icipation_extended?= Message-ID: Hi, The BSD devroom move the submission deadline to Dec 10th. There's still time to submit a talk ! Regards, - rodrigo On behalf of the BSD devroom Fosdem 2017 BSD devroom Call for Participation ============================================== Important dates -------------------- * Conference date : 4 & 5 February 2017 in Brussels, Belgium * Devroom date : Sunday 5 February 2016 * Submission deadline : Sunday 27 November 2016 * Speaker notified : Sunday 4 December 2016 The topic of the devroom includes all BSD operating systems and every talk is welcome from hacker discussions to real-word examples and presentations about new and shiny features. Practical -------------------- * The default duration for talks will be 45 minutes including discussion. Feel free to ask if you want to have a longer or a shorter slot. * Presentations can be recorded and streamed, sending your proposal implies giving permission to be recorded. However, exceptions can be made for exceptional circumstances. To submit your proposal, visit : https://penta.fosdem.org/submission/FOSDEM17 Account creation ---------------------- If you already have a Pentabarf account, please *don't* recreate a new one. If you forgot your password, reset it. If not, follow the instructions to create an account. Submit a talk ---------------------- Create an ?event? and click on "Show all" in the top right corner to display the full form. Your submission must include the following information * The title and subtitle of your talk (please be descriptive, as titles will be listed with ~500 from other projects) * select ?BSD devroom? as the track. * A short abstract of one paragraph * A longer description if you wish to do so * Links to related websites/blogs etc. - Rodrigo Osorio On behalf of the BSD Devroom From ipc at peercorpstrust.org Sat Dec 10 03:19:14 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 13:19:14 +0200 Subject: Parallel compression not using all available CPU Message-ID: Hi, I've observed that parallel compression tools such as pixz and lbzip2 do not make use of all of the available CPU under Dragonfly. On other OSes, it does. When testing on a 50 gb file, using top I've observed that CPU idle percentages consistently hover around the 90% range for pixz and ~70% for lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until compression is complete. Correspondingly, compression takes significantly longer under Dragonfly, so the CPU is really being under utilized in this case as opposed to erroneous reporting by top. 
This was tested on two systems, one 16c/32t and a 2c/2t system on a recent master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET 2016. Has anyone else possibly observed this? -- Mike From justin at shiningsilence.com Sat Dec 10 12:26:32 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Sat, 10 Dec 2016 15:26:32 -0500 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: On the two DragonFly systems, was it Hammer or UFS? I would be surprised if that made a difference, but it might? On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund wrote: > Hi, > > I've observed that parallel compression tools such as pixz and lbzip2 do not > make use of all of the available CPU under Dragonfly. On other OSes, it > does. > > When testing on a 50 gb file, using top I've observed that CPU idle > percentages consistently hover around the 90% range for pixz and ~70% for > lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until > compression is complete. Correspondingly, compression takes significantly > longer under Dragonfly, so the CPU is really being under utilized in this > case as opposed to erroneous reporting by top. > > This was tested on two systems, one 16c/32t and a 2c/2t system on a recent > master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET > 2016. > > Has anyone else possibly observed this? > > -- > Mike > From ipc at peercorpstrust.org Sat Dec 10 13:14:26 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 23:14:26 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Hi, On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. On 12/10/2016 10:26 PM, Justin Sherrill wrote: > On the two DragonFly systems, was it Hammer or UFS? I would be > surprised if that made a difference, but it might? > > On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund > wrote: >> Hi, >> >> I've observed that parallel compression tools such as pixz and lbzip2 do not >> make use of all of the available CPU under Dragonfly. On other OSes, it >> does. >> >> When testing on a 50 gb file, using top I've observed that CPU idle >> percentages consistently hover around the 90% range for pixz and ~70% for >> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >> compression is complete. Correspondingly, compression takes significantly >> longer under Dragonfly, so the CPU is really being under utilized in this >> case as opposed to erroneous reporting by top. >> >> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >> 2016. >> >> Has anyone else possibly observed this? >> >> -- >> Mike >> > From dillon at apollo.backplane.com Sat Dec 10 13:29:24 2016 From: dillon at apollo.backplane.com (Matthew Dillon) Date: Sat, 10 Dec 2016 13:29:24 -0800 (PST) Subject: Parallel compression not using all available CPU References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <20161210212924.CF3EE18CB714C2@apollo.backplane.com> :Hi, : :On both systems HAMMER was used. 
One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. Generally speaking I wouldn't want these utilities to use all available cpu threads by default. It's not usually a good idea for a program to just assume that all the cpus are there for it (by default). But they should have command line options that allow the number of threads to be specified. -Matt Matthew Dillon From jasse at yberwaffe.com Sat Dec 10 16:00:02 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Sun, 11 Dec 2016 01:00:02 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Have you tried to disable hypertreads in the BIOS ??? It's a long shot, I know, but it might help. On 2016-12-10 22:14, PeerCorps Trust Fund wrote: > Hi, > > On both systems HAMMER was used. One small correction concerning the > 2c/2t machine, both compression programs did effectively utilize that > CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t > where the CPU isn't effectively maxed out. I'll continue to try and > investigate why and report back if I find anything. > > > On 12/10/2016 10:26 PM, Justin Sherrill wrote: >> On the two DragonFly systems, was it Hammer or UFS? I would be >> surprised if that made a difference, but it might? >> >> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >> wrote: >>> Hi, >>> >>> I've observed that parallel compression tools such as pixz and >>> lbzip2 do not >>> make use of all of the available CPU under Dragonfly. On other OSes, it >>> does. >>> >>> When testing on a 50 gb file, using top I've observed that CPU idle >>> percentages consistently hover around the 90% range for pixz and >>> ~70% for >>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>> idle until >>> compression is complete. Correspondingly, compression takes >>> significantly >>> longer under Dragonfly, so the CPU is really being under utilized in >>> this >>> case as opposed to erroneous reporting by top. >>> >>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>> recent >>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>> 11:44:04 EET >>> 2016. >>> >>> Has anyone else possibly observed this? >>> >>> -- >>> Mike >>> >> > From ipc at peercorpstrust.org Sun Dec 11 13:22:44 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sun, 11 Dec 2016 23:22:44 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Message-ID: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Hi, It turns out that it was a combination of two things - turning off hyperthreading in BIOS and using a faster disk. I found a post from the author of lbzip2 which seems to describe what might be happening in this case, but reference was made to a user using an i5 mobile CPU: ######################################################################## "bzip2 author here. 
I strongly suspect that you see what you see because your Intel core i5 is probably only dual core PLUS hyper-threaded, not real quad-core. Meaning, you have two instances of the L2 per-core cache, not four, and each two hyperthreads share an L2 cache. Since the bzip2 compression/decompression is very cache sensitive (see "man bzip2"), the scaling factor will be determined mostly by how many OS-threads can dispose over a dedicated cache each. In your case this number is probably 2. Since you run two threads per core, those contend for the shared L2 cache, basically each messing with the other (flushing / invalidating the shared cache for the other). This contention shows up as double CPU time, because "waiting for cache" (or "waiting for main memory") is accounted for as CPU time. Hyperthreading is not useful but detrimental for lbzip2; so you should export LBZIP2="-n 2". You should not run more worker threads per core than: core-dedicated-cache-size divided by 8MB." ######################################################################## Running the compression again on the same file from an SSD with hyperthreading turned off, I was able to fully saturate all of the cores using lbzip2. None of this seemed obvious at first, but it rectified the situation. The biggest difference came from turning off hyperthreading (idle CPU - 20% vs the previous 90%) and then running from an SSD with hyperthreading turned off (idle CPU = 0%). Previously, the compression was run from a single HDD, not an SSD. Concerning the compression test using the same HDD under FreeBSD, well I don't know why it was able to saturate the CPU. Perhaps it has something to do with ZFS's aggressive caching. Turning that off and re-running the test would likely answer the question. Pixz performed similarly when the above two modifications were made. On 12/11/2016 02:00 AM, Jasse Jansson wrote: > Have you tried to disable hypertreads in the BIOS ??? > It's a long shot, I know, but it might help. > > On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> Hi, >> >> On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. >> >> >> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> On the two DragonFly systems, was it Hammer or UFS? I would be >>> surprised if that made a difference, but it might? >>> >>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>> wrote: >>>> Hi, >>>> >>>> I've observed that parallel compression tools such as pixz and lbzip2 do not >>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>> does. >>>> >>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>> percentages consistently hover around the 90% range for pixz and ~70% for >>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >>>> compression is complete. Correspondingly, compression takes significantly >>>> longer under Dragonfly, so the CPU is really being under utilized in this >>>> case as opposed to erroneous reporting by top. >>>> >>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >>>> 2016. >>>> >>>> Has anyone else possibly observed this? 
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
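One way to act on the thread-count advice above is a quick timing loop; a minimal /bin/sh sketch follows (the input file name and thread counts are only examples, and pixz accepts a similar -p option):

    # time lbzip2 at several worker-thread counts on the same input file
    for n in 1 2 4; do
        echo "=== lbzip2 with $n worker threads ==="
        time lbzip2 -n $n -c /tmp/testfile > /dev/null
    done

Whichever count wins on a given machine can then be made the default with something like export LBZIP2="-n 2", as quoted earlier in the thread.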
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-Haswell machines Message-ID: A memory leak has been fixed in master's drm (GPU) code which affects post-Haswell Intel GPUs, i.e. Broadwell, Skylake, etc. This leak would cause an OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell-based Intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work, but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on Broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-DPI (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should provide a pretty good experience for users. There are some sites which still implode on it (in FreeBSD too, so that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Kaby Lake support in there too, but no guarantees on stability. Radeon support has also seen some improvement, though not at the same pace as Intel GPU support. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of RAM if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support, which is supposed to eat much less memory, but not yet; chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed...
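For readers who want the flavor of the isolation-account approach before reading the handbook page, a rough sketch is below; the account name and display number are made up for illustration, and the handbook's actual recipe should be treated as authoritative:

    # one-time setup (as root): create a dedicated, unprivileged browser account
    pw useradd browser -m -s /bin/sh

    # per session, from the X session you want the browser displayed on:
    # (su will prompt for the browser account's password unless run as root)
    xhost +si:localuser:browser
    su -l browser -c 'env DISPLAY=:0 chrome --force-device-scale-factor=1.5'

This keeps the browser's profile, cache and cookies owned by a throwaway account rather than by your main workstation account.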
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016 From: vincent.delft at gmail.com (vincent delft) Date: Mon, 26 Dec 2016 12:30:12 +0100 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP Message-ID: Hello, First of all I'm new to DF. Thanks for the nice install process. It's quite easy to get DF running. But my concerns are mostly about HammerFS. My context is the following: - I would like to have a NAS system running on a small machine: 4GB RAM with a Celeron CPU at 1.6GHz. - This NAS hosts +- 700GB of data - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB outside the NAS for DRP reasons. - I'm looking for a solution tackling the bit-rot problem. Concerning HammerFS, I've already read a lot of the DF manuals and pages: http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html https://www.dragonflybsd.org/hammer/ So basically, I've understood that I must install DF on a small SSD disk and dedicate the 1TB disk to HammerFS. But I have some questions: - What must be the setup of HammerFS to solve "bit-rot"? Should it be RAID1 or RAID5, ...? Can it be done on 1 physical disk (thus 2 filesystems)? Should it be 2 PFS in 1 HammerFS located on 1 slice, or should I create 2 slices on the disk? - For DRP reasons, how can I perform a backup on an external disk? (The disk is only connected to the NAS once per month.) Should I do a "dd"? Should I use the master-slave concepts? Should I use cpdup? Many thanks for your replies. Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016 From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi) Date: Mon, 26 Dec 2016 21:15:08 +0900 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: In short, hammer doesn't solve (or fix) what you call bit-rot. 2016-12-26 20:30 GMT+09:00 vincent delft : > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. > - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk and > will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1 > or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
Peeter

--
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From vincent.delft at gmail.com  Mon Dec 26 03:30:12 2016
From: vincent.delft at gmail.com (vincent delft)
Date: Mon, 26 Dec 2016 12:30:12 +0100
Subject: what is the best Hammer setup to taclke the bitrot problem + DRP
Message-ID: 

Hello,

First of all I'm new to DF.
Thanks for the nice install process. It's quite easy to have DF running.
But my concerns are more linked to HammerFS.

My context is the following:
- I would like to have a NAS system running on a small machine: 4GB RAM
  having a celeron CPU of 1.6GHz.
- This NAS hosts +- 700GB of data
- I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
  outside the NAS for DRP reasons.
- I'm looking for a solution tackling the bit-rot problem.

Concerning HammerFS, I've already read a lot of DF manuals, pages, ... :

http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
https://www.dragonflybsd.org/hammer/

So basically, I've understood that I must install DF on a small SSD disk and
will dedicate the 1TB disk to HammerFS.

But I have some questions:
- What must be the setup of HammerFS to solve "bit-rot"? Should it be Raid1
  or Raid5, ...? Can it be done on 1 physical disk (thus 2 filesystems)?
  Should it be 2 PFS in 1 HammerFS located on 1 slice, or should I create 2
  slices on the disk?
- For DRP reasons, how can I perform a backup on an external disk? (The disk
  is only connected 1x per month to the NAS.) Should I do a "dd"? Should I
  use the master-slave concepts? Should I use cpdup?

Many thanks for your replies.

Vincent
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kusumi.tomohiro at gmail.com  Mon Dec 26 04:15:08 2016
From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi)
Date: Mon, 26 Dec 2016 21:15:08 +0900
Subject: what is the best Hammer setup to taclke the bitrot problem + DRP
In-Reply-To: 
References: 
Message-ID: 

In short, hammer doesn't solve (or fix) what you call bit-rot.

2016-12-26 20:30 GMT+09:00 vincent delft :
> Hello,
>
> First of all I'm new to DF.
> Thanks for the nice install process. It's a quite easy to have DF running.
> But my concerns are more linked to HammerFS.
>
> My context is the following:
> - I would like to have a NAS system running on a small machine: 4GB RAM
> having a celeron CPU of 1.6GHz.
> - This NAS host +- 700GB of data
> - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
> outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem.
>
>
> Concerning HammerFS, I've still read lot of DF manuals, pages, ... :
>
> http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
> http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
> http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
> http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
> https://www.dragonflybsd.org/hammer/
>
> So basically, I've understood that I must install DF on a small SSD disk and
> will dedicated the 1TB disk to HammerFS.
>
> But I have some questions:
> - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1
> or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2 > slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? should > I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > > From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 26 Dec 2016 15:06:24 +0200 Subject: what is the best Hammer setup to taclke the bitrot problem + DRP In-Reply-To: References: Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org> Hi Vincent, Most users do go with the OS installed on an SSD (120-256 GB or more), and depending on your storage needs, you can 'hammer mirror-stream' to mirror a main storage drive to a second in the same machine as well as to a third drive in a remote machine. Each of these disks can have a different history retention policy. The ability to access filesystem history gives you some safeguards against for instance an accidental deletion, or a rogue program that decides to write garbage to a file (as happened to me recently). If one of those disks should die, you have two other live full copies of all of your data that you can access. No need to rebuild or resilver anything, just promote the backup (slave) disk to a main (master) disk and move on. HAMMER has checksums which will allow you to detect bitrot and you can use a few different strategies to proactively check for it, but nothing is going to try to "repair" a file as might be the case elsewhere (HAMMER 2 seems to address this though). What you can do instead when you encounter some bad data is cd into your filesystem history (provided you had a sufficiently long enough retention policy) and look for a good copy before it was corrupted. This might also be on a different disk in another machine if you made use of remote replication. Siju George has written a nice commentary about how he made use of DragonflyBSD and HAMMER in his company: https://bsdmag.org/siju_george/ HAMMER doesn't do RAID on its own if that is what you are looking for. You will need to get a real hardware RAID card. Areca makes cards that are supported very well under Dragonfly. HAMMER's built-in filesystem replication features have some advantages over traditional RAID depending on your use case and needs. There is also a nice guide on BSDNow about HAMMERfs which might give you some additional ideas: https://www.bsdnow.tv/tutorials/hammer You can check out this for mirroring ideas as well: https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/ In your case where a disk is only connected once per month, you can use 'hammer mirror-copy' to backup your data when it is needed. For something more automatic 'hammer mirror-stream' should work also. If any of the above is in error, I imagine someone with deeper insight will chime in. Mike On 12/26/2016 01:30 PM, vincent delft wrote: > Hello, > > First of all I'm new to DF. > Thanks for the nice install process. It's a quite easy to have DF running. > But my concerns are more linked to HammerFS. > > My context is the following: > - I would like to have a NAS system running on a small machine: 4GB RAM > having a celeron CPU of 1.6GHz. > - This NAS host +- 700GB of data > - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB > outside the NAS for DRP reasons. 
> - I'm looking for a solution tackling the bit-rot problem. > > > Concerning HammerFS, I've still read lot of DF manuals, pages, ... : > > http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html > http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html > http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html > http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html > https://www.dragonflybsd.org/hammer/ > > So basically, I've understood that I must install DF on a small SSD disk > and will dedicated the 1TB disk to HammerFS. > > But I have some questions: > - What must be the setup of HammerFS to solve "bit-rot" ? Should it be > Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2 > filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or > should I create 2 slices on the disk? > > - For DRP reasons, how can I perform a backup on an external disk ? (the > disk is only connected 1x per month on the NAS). Should I do a "dd" ? > should I use the master-slave concepts ? Should I use cpdup ? > > Many thanks for your replies. > > Vincent > From dillon at backplane.com Wed Dec 28 14:32:55 2016 From: dillon at backplane.com (Matthew Dillon) Date: Wed, 28 Dec 2016 14:32:55 -0800 Subject: Swap changes in master, enforcement of RLIMIT_RSS Message-ID: The maximum supported swap space has been significantly increased in master. The amount of core memory required to manage swap is approximately 1:128 (ram:swap-used). The system allows up to half of your core ram to be used for swap management so the maximum swap you can actually use on a machine would be as follows: With 1GB of ram it would be 512M*128 = 64GB of swap. With 8GB of ram it would be 4G*128 = 512GB of swap. With 128GB of ram it would be around ~8TB of swap. The maximum the system supports due to KVM limitations is around ~32TB of swap. While you can configure more swap than the above limitations, the system might not be able to actually manage as much as you configure. I'm considering improving the management code to reduce its memory requirements. Also, people shouldn't go overboard w/regards to configuring swap space. Configured swap does eat some memory resources even when not actively in-use, and there might not be a practical use for having such massive amounts of swap with your setup. -- The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced on a per-process basis. By default this limit is infinity so nothing will change for most people. Setting a memoryuse limit has advantages in certain circumstances as long as it isn't over-used. For example, it might be reasonable to start your browser with a 500m (500 megabyte) limit. The system will allow each process VSZ to grow larger, but will force the RSS to the limit and devalue the pages it removes to make them more likely to be recycled by the system. There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the enforcement (system operates the same as it did before the commit). The default value of 1 enables enforcement by removing pages from the process pmap and deactivating them passively. The value 2 enables enforcement and actively pages data out and frees pages. Mode 2 can cause excessive paging and wear out your SSD so it is not currently recommended. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
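As a rough sketch of how the RSS enforcement described above can be used (the
browser binary name is only an example, and ulimit -m in /bin/sh takes the
limit in kilobytes as far as I recall, so double-check the units on your
system):

    # check or set the enforcement mode: 0 = off, 1 = passive (default), 2 = active pageout
    sysctl vm.pageout_memuse_mode
    sysctl vm.pageout_memuse_mode=1

    # start a browser with roughly a 500 megabyte memoryuse (RSS) limit
    ulimit -m 512000
    chrome &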
This was tested on two systems, one 16c/32t and a 2c/2t system on a recent master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET 2016. Has anyone else possibly observed this? -- Mike From justin at shiningsilence.com Sat Dec 10 12:26:32 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Sat, 10 Dec 2016 15:26:32 -0500 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: On the two DragonFly systems, was it Hammer or UFS? I would be surprised if that made a difference, but it might? On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund wrote: > Hi, > > I've observed that parallel compression tools such as pixz and lbzip2 do not > make use of all of the available CPU under Dragonfly. On other OSes, it > does. > > When testing on a 50 gb file, using top I've observed that CPU idle > percentages consistently hover around the 90% range for pixz and ~70% for > lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until > compression is complete. Correspondingly, compression takes significantly > longer under Dragonfly, so the CPU is really being under utilized in this > case as opposed to erroneous reporting by top. > > This was tested on two systems, one 16c/32t and a 2c/2t system on a recent > master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET > 2016. > > Has anyone else possibly observed this? > > -- > Mike > From ipc at peercorpstrust.org Sat Dec 10 13:14:26 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sat, 10 Dec 2016 23:14:26 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: Message-ID: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Hi, On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. On 12/10/2016 10:26 PM, Justin Sherrill wrote: > On the two DragonFly systems, was it Hammer or UFS? I would be > surprised if that made a difference, but it might? > > On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund > wrote: >> Hi, >> >> I've observed that parallel compression tools such as pixz and lbzip2 do not >> make use of all of the available CPU under Dragonfly. On other OSes, it >> does. >> >> When testing on a 50 gb file, using top I've observed that CPU idle >> percentages consistently hover around the 90% range for pixz and ~70% for >> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >> compression is complete. Correspondingly, compression takes significantly >> longer under Dragonfly, so the CPU is really being under utilized in this >> case as opposed to erroneous reporting by top. >> >> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >> 2016. >> >> Has anyone else possibly observed this? >> >> -- >> Mike >> > From dillon at apollo.backplane.com Sat Dec 10 13:29:24 2016 From: dillon at apollo.backplane.com (Matthew Dillon) Date: Sat, 10 Dec 2016 13:29:24 -0800 (PST) Subject: Parallel compression not using all available CPU References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <20161210212924.CF3EE18CB714C2@apollo.backplane.com> :Hi, : :On both systems HAMMER was used. 
One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. Generally speaking I wouldn't want these utilities to use all available cpu threads by default. It's not usually a good idea for a program to just assume that all the cpus are there for it (by default). But they should have command line options that allow the number of threads to be specified. -Matt Matthew Dillon From jasse at yberwaffe.com Sat Dec 10 16:00:02 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Sun, 11 Dec 2016 01:00:02 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> Message-ID: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Have you tried to disable hypertreads in the BIOS ??? It's a long shot, I know, but it might help. On 2016-12-10 22:14, PeerCorps Trust Fund wrote: > Hi, > > On both systems HAMMER was used. One small correction concerning the > 2c/2t machine, both compression programs did effectively utilize that > CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t > where the CPU isn't effectively maxed out. I'll continue to try and > investigate why and report back if I find anything. > > > On 12/10/2016 10:26 PM, Justin Sherrill wrote: >> On the two DragonFly systems, was it Hammer or UFS? I would be >> surprised if that made a difference, but it might? >> >> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >> wrote: >>> Hi, >>> >>> I've observed that parallel compression tools such as pixz and >>> lbzip2 do not >>> make use of all of the available CPU under Dragonfly. On other OSes, it >>> does. >>> >>> When testing on a 50 gb file, using top I've observed that CPU idle >>> percentages consistently hover around the 90% range for pixz and >>> ~70% for >>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>> idle until >>> compression is complete. Correspondingly, compression takes >>> significantly >>> longer under Dragonfly, so the CPU is really being under utilized in >>> this >>> case as opposed to erroneous reporting by top. >>> >>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>> recent >>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>> 11:44:04 EET >>> 2016. >>> >>> Has anyone else possibly observed this? >>> >>> -- >>> Mike >>> >> > From ipc at peercorpstrust.org Sun Dec 11 13:22:44 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Sun, 11 Dec 2016 23:22:44 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> Message-ID: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Hi, It turns out that it was a combination of two things - turning off hyperthreading in BIOS and using a faster disk. I found a post from the author of lbzip2 which seems to describe what might be happening in this case, but reference was made to a user using an i5 mobile CPU: ######################################################################## "bzip2 author here. 
I strongly suspect that you see what you see because your Intel core i5 is probably only dual core PLUS hyper-threaded, not real quad-core. Meaning, you have two instances of the L2 per-core cache, not four, and each two hyperthreads share an L2 cache. Since the bzip2 compression/decompression is very cache sensitive (see "man bzip2"), the scaling factor will be determined mostly by how many OS-threads can dispose over a dedicated cache each. In your case this number is probably 2. Since you run two threads per core, those contend for the shared L2 cache, basically each messing with the other (flushing / invalidating the shared cache for the other). This contention shows up as double CPU time, because "waiting for cache" (or "waiting for main memory") is accounted for as CPU time. Hyperthreading is not useful but detrimental for lbzip2; so you should export LBZIP2="-n 2". You should not run more worker threads per core than: core-dedicated-cache-size divided by 8MB." ######################################################################## Running the compression again on the same file from an SSD with hyperthreading turned off, I was able to fully saturate all of the cores using lbzip2. None of this seemed obvious at first, but it rectified the situation. The biggest difference came from turning off hyperthreading (idle CPU - 20% vs the previous 90%) and then running from an SSD with hyperthreading turned off (idle CPU = 0%). Previously, the compression was run from a single HDD, not an SSD. Concerning the compression test using the same HDD under FreeBSD, well I don't know why it was able to saturate the CPU. Perhaps it has something to do with ZFS's aggressive caching. Turning that off and re-running the test would likely answer the question. Pixz performed similarly when the above two modifications were made. On 12/11/2016 02:00 AM, Jasse Jansson wrote: > Have you tried to disable hypertreads in the BIOS ??? > It's a long shot, I know, but it might help. > > On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> Hi, >> >> On both systems HAMMER was used. One small correction concerning the 2c/2t machine, both compression programs did effectively utilize that CPU which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU isn't effectively maxed out. I'll continue to try and investigate why and report back if I find anything. >> >> >> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> On the two DragonFly systems, was it Hammer or UFS? I would be >>> surprised if that made a difference, but it might? >>> >>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>> wrote: >>>> Hi, >>>> >>>> I've observed that parallel compression tools such as pixz and lbzip2 do not >>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>> does. >>>> >>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>> percentages consistently hover around the 90% range for pixz and ~70% for >>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle until >>>> compression is complete. Correspondingly, compression takes significantly >>>> longer under Dragonfly, so the CPU is really being under utilized in this >>>> case as opposed to erroneous reporting by top. >>>> >>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a recent >>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 EET >>>> 2016. >>>> >>>> Has anyone else possibly observed this? 
>>>> >>>> -- >>>> Mike >>>> >>> >> > > From dillon at backplane.com Sun Dec 11 15:37:36 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sun, 11 Dec 2016 15:37:36 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: That doesn't make any sense. It sounds like it is just compressing more slowly, so there is less idle time because the HDD/SSD is able to keep up due to it compressing more slowly. You don't want to turn off hyperthreading in the BIOS and cache coherency stalls will not show up in the idle% anyway. -Matt On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < ipc at peercorpstrust.org> wrote: > Hi, > > It turns out that it was a combination of two things - turning off > hyperthreading in BIOS and using a faster disk. > > I found a post from the author of lbzip2 which seems to describe what > might be happening in this case, but reference was made to a user using an > i5 mobile CPU: > > ######################################################################## > > "bzip2 author here. I strongly suspect that you see what you see because > your Intel core i5 is probably only dual core PLUS hyper-threaded, not real > quad-core. Meaning, you have two instances of the L2 per-core cache, not > four, and each two hyperthreads share an L2 cache. > > Since the bzip2 compression/decompression is very cache sensitive (see > "man bzip2"), the scaling factor will be determined mostly by how many > OS-threads can dispose over a dedicated cache each. In your case this > number is probably 2. > > Since you run two threads per core, those contend for the shared L2 cache, > basically each messing with the other (flushing / invalidating the shared > cache for the other). This contention shows up as double CPU time, because > "waiting for cache" (or "waiting for main memory") is accounted for as CPU > time. > > Hyperthreading is not useful but detrimental for lbzip2; so you should > export LBZIP2="-n 2". You should not run more worker threads per core than: > core-dedicated-cache-size divided by 8MB." > > ######################################################################## > > Running the compression again on the same file from an SSD with > hyperthreading turned off, I was able to fully saturate all of the cores > using lbzip2. None of this seemed obvious at first, but it rectified the > situation. The biggest difference came from turning off hyperthreading > (idle CPU - 20% vs the previous 90%) and then running from an SSD with > hyperthreading turned off (idle CPU = 0%). > > Previously, the compression was run from a single HDD, not an SSD. > Concerning the compression test using the same HDD under FreeBSD, well I > don't know why it was able to saturate the CPU. Perhaps it has something to > do with ZFS's aggressive caching. Turning that off and re-running the test > would likely answer the question. Pixz performed similarly when the above > two modifications were made. > > > > > On 12/11/2016 02:00 AM, Jasse Jansson wrote: > >> Have you tried to disable hypertreads in the BIOS ??? >> It's a long shot, I know, but it might help. >> >> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >> >>> Hi, >>> >>> On both systems HAMMER was used. 
One small correction concerning the >>> 2c/2t machine, both compression programs did effectively utilize that CPU >>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>> isn't effectively maxed out. I'll continue to try and investigate why and >>> report back if I find anything. >>> >>> >>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>> >>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>> surprised if that made a difference, but it might? >>>> >>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>> do not >>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>> does. >>>>> >>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>> for >>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>> until >>>>> compression is complete. Correspondingly, compression takes >>>>> significantly >>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>> this >>>>> case as opposed to erroneous reporting by top. >>>>> >>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>> recent >>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>> EET >>>>> 2016. >>>>> >>>>> Has anyone else possibly observed this? >>>>> >>>>> -- >>>>> Mike >>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Sun Dec 11 22:14:29 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 12 Dec 2016 08:14:29 +0200 Subject: Parallel compression not using all available CPU In-Reply-To: References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> Message-ID: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Thanks for this. Is there ever a side case where hyperthreading might have unpredictable results or should it generally always be left on? On 12/12/2016 01:37 AM, Matthew Dillon wrote: > That doesn't make any sense. It sounds like it is just compressing more > slowly, so there is less idle time because the HDD/SSD is able to keep up > due to it compressing more slowly. You don't want to turn off > hyperthreading in the BIOS and cache coherency stalls will not show up in > the idle% anyway. > > -Matt > > On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < > ipc at peercorpstrust.org> wrote: > >> Hi, >> >> It turns out that it was a combination of two things - turning off >> hyperthreading in BIOS and using a faster disk. >> >> I found a post from the author of lbzip2 which seems to describe what >> might be happening in this case, but reference was made to a user using an >> i5 mobile CPU: >> >> ######################################################################## >> >> "bzip2 author here. I strongly suspect that you see what you see because >> your Intel core i5 is probably only dual core PLUS hyper-threaded, not real >> quad-core. Meaning, you have two instances of the L2 per-core cache, not >> four, and each two hyperthreads share an L2 cache. 
>> >> Since the bzip2 compression/decompression is very cache sensitive (see >> "man bzip2"), the scaling factor will be determined mostly by how many >> OS-threads can dispose over a dedicated cache each. In your case this >> number is probably 2. >> >> Since you run two threads per core, those contend for the shared L2 cache, >> basically each messing with the other (flushing / invalidating the shared >> cache for the other). This contention shows up as double CPU time, because >> "waiting for cache" (or "waiting for main memory") is accounted for as CPU >> time. >> >> Hyperthreading is not useful but detrimental for lbzip2; so you should >> export LBZIP2="-n 2". You should not run more worker threads per core than: >> core-dedicated-cache-size divided by 8MB." >> >> ######################################################################## >> >> Running the compression again on the same file from an SSD with >> hyperthreading turned off, I was able to fully saturate all of the cores >> using lbzip2. None of this seemed obvious at first, but it rectified the >> situation. The biggest difference came from turning off hyperthreading >> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >> hyperthreading turned off (idle CPU = 0%). >> >> Previously, the compression was run from a single HDD, not an SSD. >> Concerning the compression test using the same HDD under FreeBSD, well I >> don't know why it was able to saturate the CPU. Perhaps it has something to >> do with ZFS's aggressive caching. Turning that off and re-running the test >> would likely answer the question. Pixz performed similarly when the above >> two modifications were made. >> >> >> >> >> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >> >>> Have you tried to disable hypertreads in the BIOS ??? >>> It's a long shot, I know, but it might help. >>> >>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>> >>>> Hi, >>>> >>>> On both systems HAMMER was used. One small correction concerning the >>>> 2c/2t machine, both compression programs did effectively utilize that CPU >>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t where the CPU >>>> isn't effectively maxed out. I'll continue to try and investigate why and >>>> report back if I find anything. >>>> >>>> >>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>> >>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>> surprised if that made a difference, but it might? >>>>> >>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I've observed that parallel compression tools such as pixz and lbzip2 >>>>>> do not >>>>>> make use of all of the available CPU under Dragonfly. On other OSes, it >>>>>> does. >>>>>> >>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>> percentages consistently hover around the 90% range for pixz and ~70% >>>>>> for >>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% idle >>>>>> until >>>>>> compression is complete. Correspondingly, compression takes >>>>>> significantly >>>>>> longer under Dragonfly, so the CPU is really being under utilized in >>>>>> this >>>>>> case as opposed to erroneous reporting by top. >>>>>> >>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>> recent >>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 11:44:04 >>>>>> EET >>>>>> 2016. >>>>>> >>>>>> Has anyone else possibly observed this? 
>>>>>> >>>>>> -- >>>>>> Mike >>>>>> >>>>>> >>>>> >>>> >>> >>> > From jasse at yberwaffe.com Mon Dec 12 05:41:40 2016 From: jasse at yberwaffe.com (Jasse Jansson) Date: Mon, 12 Dec 2016 14:41:40 +0100 Subject: Parallel compression not using all available CPU In-Reply-To: <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> Message-ID: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> They recommend to turn hyperthreading off if you run studio software on your computer. That's if you run Windows, I have no idea if HT affects a Unix derivative anyway. On 2016-12-12 07:14, PeerCorps Trust Fund wrote: > Thanks for this. > > Is there ever a side case where hyperthreading might have > unpredictable results or should it generally always be left on? > > On 12/12/2016 01:37 AM, Matthew Dillon wrote: >> That doesn't make any sense. It sounds like it is just compressing more >> slowly, so there is less idle time because the HDD/SSD is able to >> keep up >> due to it compressing more slowly. You don't want to turn off >> hyperthreading in the BIOS and cache coherency stalls will not show >> up in >> the idle% anyway. >> >> -Matt >> >> On Sun, Dec 11, 2016 at 1:22 PM, PeerCorps Trust Fund < >> ipc at peercorpstrust.org> wrote: >> >>> Hi, >>> >>> It turns out that it was a combination of two things - turning off >>> hyperthreading in BIOS and using a faster disk. >>> >>> I found a post from the author of lbzip2 which seems to describe what >>> might be happening in this case, but reference was made to a user >>> using an >>> i5 mobile CPU: >>> >>> ######################################################################## >>> >>> >>> "bzip2 author here. I strongly suspect that you see what you see >>> because >>> your Intel core i5 is probably only dual core PLUS hyper-threaded, >>> not real >>> quad-core. Meaning, you have two instances of the L2 per-core cache, >>> not >>> four, and each two hyperthreads share an L2 cache. >>> >>> Since the bzip2 compression/decompression is very cache sensitive (see >>> "man bzip2"), the scaling factor will be determined mostly by how many >>> OS-threads can dispose over a dedicated cache each. In your case this >>> number is probably 2. >>> >>> Since you run two threads per core, those contend for the shared L2 >>> cache, >>> basically each messing with the other (flushing / invalidating the >>> shared >>> cache for the other). This contention shows up as double CPU time, >>> because >>> "waiting for cache" (or "waiting for main memory") is accounted for >>> as CPU >>> time. >>> >>> Hyperthreading is not useful but detrimental for lbzip2; so you should >>> export LBZIP2="-n 2". You should not run more worker threads per >>> core than: >>> core-dedicated-cache-size divided by 8MB." >>> >>> ######################################################################## >>> >>> >>> Running the compression again on the same file from an SSD with >>> hyperthreading turned off, I was able to fully saturate all of the >>> cores >>> using lbzip2. None of this seemed obvious at first, but it rectified >>> the >>> situation. The biggest difference came from turning off hyperthreading >>> (idle CPU - 20% vs the previous 90%) and then running from an SSD with >>> hyperthreading turned off (idle CPU = 0%). 
>>> >>> Previously, the compression was run from a single HDD, not an SSD. >>> Concerning the compression test using the same HDD under FreeBSD, >>> well I >>> don't know why it was able to saturate the CPU. Perhaps it has >>> something to >>> do with ZFS's aggressive caching. Turning that off and re-running >>> the test >>> would likely answer the question. Pixz performed similarly when the >>> above >>> two modifications were made. >>> >>> >>> >>> >>> On 12/11/2016 02:00 AM, Jasse Jansson wrote: >>> >>>> Have you tried to disable hypertreads in the BIOS ??? >>>> It's a long shot, I know, but it might help. >>>> >>>> On 2016-12-10 22:14, PeerCorps Trust Fund wrote: >>>> >>>>> Hi, >>>>> >>>>> On both systems HAMMER was used. One small correction concerning the >>>>> 2c/2t machine, both compression programs did effectively utilize >>>>> that CPU >>>>> which had an idle % of 0.0. It is the bigger machine, 16c/32t >>>>> where the CPU >>>>> isn't effectively maxed out. I'll continue to try and investigate >>>>> why and >>>>> report back if I find anything. >>>>> >>>>> >>>>> On 12/10/2016 10:26 PM, Justin Sherrill wrote: >>>>> >>>>>> On the two DragonFly systems, was it Hammer or UFS? I would be >>>>>> surprised if that made a difference, but it might? >>>>>> >>>>>> On Sat, Dec 10, 2016 at 6:19 AM, PeerCorps Trust Fund >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've observed that parallel compression tools such as pixz and >>>>>>> lbzip2 >>>>>>> do not >>>>>>> make use of all of the available CPU under Dragonfly. On other >>>>>>> OSes, it >>>>>>> does. >>>>>>> >>>>>>> When testing on a 50 gb file, using top I've observed that CPU idle >>>>>>> percentages consistently hover around the 90% range for pixz and >>>>>>> ~70% >>>>>>> for >>>>>>> lbzip2. These values under FreeBSD and Linux are typically ~0.0% >>>>>>> idle >>>>>>> until >>>>>>> compression is complete. Correspondingly, compression takes >>>>>>> significantly >>>>>>> longer under Dragonfly, so the CPU is really being under >>>>>>> utilized in >>>>>>> this >>>>>>> case as opposed to erroneous reporting by top. >>>>>>> >>>>>>> This was tested on two systems, one 16c/32t and a 2c/2t system on a >>>>>>> recent >>>>>>> master DragonFly v4.7.0.973.g8d7da-DEVELOPMENT #2: Wed Dec 7 >>>>>>> 11:44:04 >>>>>>> EET >>>>>>> 2016. >>>>>>> >>>>>>> Has anyone else possibly observed this? >>>>>>> >>>>>>> -- >>>>>>> Mike >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >> > From dillon at backplane.com Tue Dec 13 12:56:41 2016 From: dillon at backplane.com (Matthew Dillon) Date: Tue, 13 Dec 2016 12:56:41 -0800 Subject: Parallel compression not using all available CPU In-Reply-To: <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> References: <7506e4e8-e585-9d5a-83bd-6093543b9c12@peercorpstrust.org> <46efd5d1-c3f1-506d-f447-ceb4e7887231@yberwaffe.com> <102ca954-076f-b0f4-f6d3-cd90ec9c11a0@peercorpstrust.org> <544eb02b-16f1-5673-630e-126a1d3edec7@peercorpstrust.org> <3490623b-43d0-25ed-f538-1ba169ee9b89@yberwaffe.com> Message-ID: Never turn hyper-threading off if running (any) BSD or Linux. The schedulers understand hyperthreads. In terms of compression programs, presumably they have an option to specify how many threads to create to do the compression. Simply test with 1, 2, and 4 threads to determine the optimal number. -Matt ? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dillon at backplane.com Sat Dec 17 12:10:35 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:10:35 -0800 Subject: Graphics-related memory leak fixed in master for post-haswell machines Message-ID: A memory leak has been fixed in master's drm (gpu) code which affects post-haswell intel GPUs. i.e. Broadwell, Skylake, etc. This leak would cause a OOM panic typically after a couple of weeks of uptime. The leak does not affect older Haswell based intel systems. A number of other stability improvements have also gone into the system. The release branch got some of the stability work but the fix for the graphics memory leak is a bit too complex to backport in a compatible manner. At the current juncture, anyone running a workstation with release might want to switch to master. Master will be *much* better on broadwell or later GPUs. Also somewhat related, chromium (chrome) has been updated from v52 to v54 on both branches, which fixes some serious GUI issues on high-dpi (aka 4K) monitors. Chrome 54 still has some issues with certain sites, but generally speaking it seems to be pretty stable and I recommend its use. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Sat Dec 17 12:19:31 2016 From: dillon at backplane.com (Matthew Dillon) Date: Sat, 17 Dec 2016 12:19:31 -0800 Subject: Setting up a more secure browser environment Message-ID: I have created a new page in the handbook describing how to set up a secure browser environment using isolation accounts. We recommend running your browser this way instead of running it directly from your main workstation account. https://www.dragonflybsd.org/docs/docs/handbook/RunSecureBrowser/ Currently chrome is considered to be the most secure. Now that chrome 54 is in dports, it should generate a pretty good experience for users. There are some sites which still implode on it (and in FreeBSD too, that isn't a DFly-specific problem), and for those you may have to run firefox. My personal preference is chrome with some adjustments to the setup. For people running chrome on a 4K monitor, I recommend running chrome with the "--force-device-scale-factor=1.5" option before you start messing with font or other settings. It will scale the UI a little closer to what you probably want. DragonFly master's support through Intel Skylake is now very good. I am running a Skylake box as my workstation now and it has been very stable. Theoretically we have some Karby support in there too but no guarantees on stability. Radeon support has also seen some improvement though not at the same pace as Intel GPU support has seen. At this point, DragonFly runs extremely well even on mobile chipsets (though on a mobile chipset, a 1920x1080 monitor is a better bet than 4K in terms of performance). I recommend a minimum of 4GB of ram if using X. DragonFly will run X fine on 2GB, but if you do so you need a fast SATA or NVMe SSD and a large swap partition to page to for decent browser performance. Browsers are still big memory hogs. Eventually we will have chrome 55 support which is supposed to eat much less memory, but not yet. chrome 54 is where we are now. -Matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From karu.pruun at gmail.com Mon Dec 19 05:21:56 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Mon, 19 Dec 2016 15:21:56 +0200 Subject: question about hammer mirror-stream Message-ID: Hello A quick question regarding hammer mirror-streaming. The master and slave are mounted as # mount_null /pfs/master /home/master # mount_null /pfs/slave /home/slave and streaming is ongoing with # hammer mirror-stream /home/master /home/slave When there's a change in master, e.g. % sudo touch /home/master/fileX then fileX will show up in /pfs/slave but not in /home/slave. It will show up in the latter when I stop streaming, unmount the slave and mount it again. Is this the intended behavior? Thanks Peeter -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From t+dfbsd at timdarby.net Mon Dec 19 08:50:06 2016 From: t+dfbsd at timdarby.net (Tim Darby) Date: Mon, 19 Dec 2016 09:50:06 -0700 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > then fileX will show up in /pfs/slave but not in /home/slave. It will show > up in the latter when I stop streaming, unmount the slave and mount it > again. Is this the intended behavior? > Yes, that's how it works. You won't see the updates to the mounted slave until you unmount and re-mount. Tim? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillon at backplane.com Mon Dec 19 10:35:29 2016 From: dillon at backplane.com (Matthew Dillon) Date: Mon, 19 Dec 2016 10:35:29 -0800 Subject: AHCI driver on master - heads up Message-ID: Some work has gone into the AHCI driver on master. It should not affect anyone, but I'm posting a heads-up because this is the primary SATA storage device used on most machines and the changes made are significant enough that they could conceivably cause an issue. So if you are using master, be sure you have a good backup kernel in case something goes wrong. The easiest way to create a backup kernel is to first install the new kernel. Your previous kernel will be stripped and copied to /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: cpdup /boot/kernel.old /boot/kernel.bak This way you don't waste space in /boot copying kernels and modules with full debug symbols. Having a kernel.bak is a good idea because it will never be overwritten by an installkernel or other build target. It will show up in the loader under 'b' for backup kernel. -- Experimental FIS-Based Switching support has been added to the AHCI driver's port-multiplier code. Most AHCI chipsets do not have this capability and the few that do tend to have fairly buggy hardware, so I am not recommending that port multipliers be used generally. However, after years of waiting I finally got my hands on a mobo that had the feature and I've wanted to implement it forever, so I have done so. The FBS feature allows the computer to queue concurrent commands to multiple targets behind a port-multiplier. Without it, only one target can be addressed at a time. Test results with some old HDDs I had on a shelf can be found here. As expected, concurrent performance to multiple targets is vastly improved. http://apollo.backplane.com/DFlyMisc/portmult01.txt However, I must again stress that AHCI-based port-multipliers tend to fall-over when errors occur, or even if you are just hot-swapping a drive sometimes. Using a port-multiplier is not a good solution for serious setups. 
Blame Intel for intentionally screwing up the consumer AHCI spec in order to prevent it from competing with commercial SAS controllers. -Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From ipc at peercorpstrust.org Mon Dec 19 11:14:07 2016 From: ipc at peercorpstrust.org (PeerCorps Trust Fund) Date: Mon, 19 Dec 2016 21:14:07 +0200 Subject: AHCI driver on master - heads up In-Reply-To: References: Message-ID: <5c029dfe-2b7f-72a7-9d15-11ce3f93c62f@peercorpstrust.org> This is interesting. AsRock Rack has quite a number of server/workstation boards that has the Marvell 88SE9172 AHCI controller in particular (9 currently). We have two of them on our end, so this is certainly very useful for us. Thank you! On 12/19/2016 08:35 PM, Matthew Dillon wrote: > Some work has gone into the AHCI driver on master. It should not affect > anyone, but I'm posting a heads-up because this is the primary SATA storage > device used on most machines and the changes made are significant enough > that they could conceivably cause an issue. So if you are using master, be > sure you have a good backup kernel in case something goes wrong. > > The easiest way to create a backup kernel is to first install the new > kernel. Your previous kernel will be stripped and copied to > /boot/kernel.old. Then copy /boot/kernel.old to /boot/kernel.bak using: > > cpdup /boot/kernel.old /boot/kernel.bak > > This way you don't waste space in /boot copying kernels and modules with > full debug symbols. Having a kernel.bak is a good idea because it will > never be overwritten by an installkernel or other build target. It will > show up in the loader under 'b' for backup kernel. > > -- > > Experimental FIS-Based Switching support has been added to the AHCI > driver's port-multiplier code. Most AHCI chipsets do not have this > capability and the few that do tend to have fairly buggy hardware, so I am > not recommending that port multipliers be used generally. However, after > years of waiting I finally got my hands on a mobo that had the feature and > I've wanted to implement it forever, so I have done so. > > The FBS feature allows the computer to queue concurrent commands to > multiple targets behind a port-multiplier. Without it, only one target can > be addressed at a time. Test results with some old HDDs I had on a shelf > can be found here. As expected, concurrent performance to multiple targets > is vastly improved. > > http://apollo.backplane.com/DFlyMisc/portmult01.txt > > However, I must again stress that AHCI-based port-multipliers tend to > fall-over when errors occur, or even if you are just hot-swapping a drive > sometimes. Using a port-multiplier is not a good solution for serious > setups. Blame Intel for intentionally screwing up the consumer AHCI spec > in order to prevent it from competing with commercial SAS controllers. > > -Matt > From karu.pruun at gmail.com Thu Dec 22 05:11:17 2016 From: karu.pruun at gmail.com (karu.pruun) Date: Thu, 22 Dec 2016 15:11:17 +0200 Subject: question about hammer mirror-stream In-Reply-To: References: Message-ID: On Mon, Dec 19, 2016 at 6:50 PM, Tim Darby wrote: > On Mon, Dec 19, 2016 at 6:21 AM, karu.pruun wrote: > >> then fileX will show up in /pfs/slave but not in /home/slave. It will >> show up in the latter when I stop streaming, unmount the slave and mount it >> again. Is this the intended behavior? >> > > Yes, that's how it works. You won't see the updates to the mounted slave > until you unmount and re-mount. > Thanks! 
Peeter
--
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vincent.delft at gmail.com Mon Dec 26 03:30:12 2016
From: vincent.delft at gmail.com (vincent delft)
Date: Mon, 26 Dec 2016 12:30:12 +0100
Subject: what is the best Hammer setup to tackle the bitrot problem + DRP
Message-ID:

Hello,

First of all I'm new to DF.
Thanks for the nice install process. It's quite easy to have DF running.
But my concerns are more linked to HammerFS.

My context is the following:
- I would like to have a NAS system running on a small machine: 4GB RAM
with a 1.6GHz Celeron CPU.
- This NAS hosts +- 700GB of data
- I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
outside the NAS for DRP reasons.
- I'm looking for a solution tackling the bit-rot problem.

Concerning HammerFS, I've already read a lot of DF manuals, pages, ... :

http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
https://www.dragonflybsd.org/hammer/

So basically, I've understood that I must install DF on a small SSD disk
and dedicate the 1TB disk to HammerFS.

But I have some questions:
- What must be the setup of HammerFS to solve "bit-rot" ? Should it be
Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2
filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or
should I create 2 slices on the disk?

- For DRP reasons, how can I perform a backup on an external disk ? (the
disk is only connected 1x per month on the NAS). Should I do a "dd" ?
should I use the master-slave concepts ? Should I use cpdup ?

Many thanks for your replies.

Vincent
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kusumi.tomohiro at gmail.com Mon Dec 26 04:15:08 2016
From: kusumi.tomohiro at gmail.com (Tomohiro Kusumi)
Date: Mon, 26 Dec 2016 21:15:08 +0900
Subject: what is the best Hammer setup to tackle the bitrot problem + DRP
In-Reply-To:
References:
Message-ID:

In short, hammer doesn't solve (or fix) what you call bit-rot.

2016-12-26 20:30 GMT+09:00 vincent delft :
> Hello,
>
> First of all I'm new to DF.
> Thanks for the nice install process. It's a quite easy to have DF running.
> But my concerns are more linked to HammerFS.
>
> My context is the following:
> - I would like to have a NAS system running on a small machine: 4GB RAM
> having a celeron CPU of 1.6GHz.
> - This NAS host +- 700GB of data
> - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
> outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem.
>
>
> Concerning HammerFS, I've still read lot of DF manuals, pages, ... :
>
> http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
> http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
> http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
> http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
> https://www.dragonflybsd.org/hammer/
>
> So basically, I've understood that I must install DF on a small SSD disk and
> will dedicated the 1TB disk to HammerFS.
>
> But I have some questions:
> - What must be the setup of HammerFS to solve "bit-rot" ? Should it be Raid1
> or Raid5, ... ? Can it be done on 1 physical disk (thus 2 filesystems) ?
> Should it be 2 PFS in 1 HammerFS located on 1 slice or should I create 2
> slices on the disk?
>
> - For DRP reasons, how can I perform a backup on an external disk ? (the
> disk is only connected 1x per month on the NAS). Should I do a "dd" ? should
> I use the master-slave concepts ? Should I use cpdup ?
>
> Many thanks for your replies.
>
> Vincent
>
>

From ipc at peercorpstrust.org Mon Dec 26 05:06:24 2016
From: ipc at peercorpstrust.org (PeerCorps Trust Fund)
Date: Mon, 26 Dec 2016 15:06:24 +0200
Subject: what is the best Hammer setup to tackle the bitrot problem + DRP
In-Reply-To:
References:
Message-ID: <6e6029eb-5097-4411-2cfc-7e6e0936d0b6@peercorpstrust.org>

Hi Vincent,

Most users do go with the OS installed on an SSD (120-256 GB or more), and
depending on your storage needs, you can 'hammer mirror-stream' to mirror
a main storage drive to a second in the same machine as well as to a third
drive in a remote machine. Each of these disks can have a different
history retention policy.

The ability to access filesystem history gives you some safeguards
against, for instance, an accidental deletion or a rogue program that
decides to write garbage to a file (as happened to me recently). If one of
those disks should die, you have two other live full copies of all of your
data that you can access. No need to rebuild or resilver anything; just
promote the backup (slave) disk to a main (master) disk and move on.

HAMMER has checksums which will allow you to detect bitrot and you can use
a few different strategies to proactively check for it, but nothing is
going to try to "repair" a file as might be the case elsewhere (HAMMER 2
seems to address this though). What you can do instead when you encounter
some bad data is cd into your filesystem history (provided you had a
sufficiently long retention policy) and look for a good copy before it was
corrupted. This might also be on a different disk in another machine if
you made use of remote replication.

Siju George has written a nice commentary about how he made use of
DragonflyBSD and HAMMER in his company:

https://bsdmag.org/siju_george/

HAMMER doesn't do RAID on its own if that is what you are looking for. You
will need to get a real hardware RAID card. Areca makes cards that are
supported very well under Dragonfly. HAMMER's built-in filesystem
replication features have some advantages over traditional RAID depending
on your use case and needs.

There is also a nice guide on BSDNow about HAMMERfs which might give you
some additional ideas:

https://www.bsdnow.tv/tutorials/hammer

You can check out this for mirroring ideas as well:

https://www.dragonflybsd.org/docs/how_to_implement_hammer_pseudo_file_system__40___pfs___41___slave_mirroring_from_pfs_master/

In your case where a disk is only connected once per month, you can use
'hammer mirror-copy' to back up your data when it is needed. For something
more automatic 'hammer mirror-stream' should work also.

If any of the above is in error, I imagine someone with deeper insight
will chime in.

Mike

On 12/26/2016 01:30 PM, vincent delft wrote:
> Hello,
>
> First of all I'm new to DF.
> Thanks for the nice install process. It's a quite easy to have DF running.
> But my concerns are more linked to HammerFS.
>
> My context is the following:
> - I would like to have a NAS system running on a small machine: 4GB RAM
> having a celeron CPU of 1.6GHz.
> - This NAS host +- 700GB of data
> - I plan to install 1 disk of 1TB in the NAS and keep a second disk of 1TB
> outside the NAS for DRP reasons.
> - I'm looking for a solution tackling the bit-rot problem.
>
>
> Concerning HammerFS, I've still read lot of DF manuals, pages, ... :
>
> http://lists.dragonflybsd.org/pipermail/users/2015-March/207585.html
> http://lists.dragonflybsd.org/pipermail/users/2006-June/297848.html
> http://lists.dragonflybsd.org/pipermail/users/2016-March/228659.html
> http://lists.dragonflybsd.org/pipermail/users/2015-January/311705.html
> https://www.dragonflybsd.org/hammer/
>
> So basically, I've understood that I must install DF on a small SSD disk
> and will dedicated the 1TB disk to HammerFS.
>
> But I have some questions:
> - What must be the setup of HammerFS to solve "bit-rot" ? Should it be
> Raid1 or Raid5, ... ? Can it be done on 1 physical disk (thus 2
> filesystems) ? Should it be 2 PFS in 1 HammerFS located on 1 slice or
> should I create 2 slices on the disk?
>
> - For DRP reasons, how can I perform a backup on an external disk ? (the
> disk is only connected 1x per month on the NAS). Should I do a "dd" ?
> should I use the master-slave concepts ? Should I use cpdup ?
>
> Many thanks for your replies.
>
> Vincent
>

From dillon at backplane.com Wed Dec 28 14:32:55 2016
From: dillon at backplane.com (Matthew Dillon)
Date: Wed, 28 Dec 2016 14:32:55 -0800
Subject: Swap changes in master, enforcement of RLIMIT_RSS
Message-ID:

The maximum supported swap space has been significantly increased in
master. The amount of core memory required to manage swap is approximately
1:128 (ram:swap-used). The system allows up to half of your core ram to be
used for swap management so the maximum swap you can actually use on a
machine would be as follows:

With 1GB of ram it would be 512M*128 = 64GB of swap.
With 8GB of ram it would be 4G*128 = 512GB of swap.
With 128GB of ram it would be around ~8TB of swap.

The maximum the system supports due to KVM limitations is around ~32TB of
swap. While you can configure more swap than the above limitations, the
system might not be able to actually manage as much as you configure. I'm
considering improving the management code to reduce its memory
requirements.

Also, people shouldn't go overboard w/regards to configuring swap space.
Configured swap does eat some memory resources even when not actively
in-use, and there might not be a practical use for having such massive
amounts of swap with your setup.

--

The memoryuse resource limit (ulimit -m in /bin/sh) is now being enforced
on a per-process basis. By default this limit is infinity so nothing will
change for most people. Setting a memoryuse limit has advantages in
certain circumstances as long as it isn't over-used. For example, it might
be reasonable to start your browser with a 500m (500 megabyte) limit. The
system will allow each process VSZ to grow larger, but will force the RSS
to the limit and devalue the pages it removes to make them more likely to
be recycled by the system.

There is a sysctl, vm.pageout_memuse_mode, to control the behavior. It can
be set to 0, 1, or 2, and defaults to 1. Setting it to 0 disables the
enforcement (system operates the same as it did before the commit). The
default value of 1 enables enforcement by removing pages from the process
pmap and deactivating them passively. The value 2 enables enforcement and
actively pages data out and frees pages. Mode 2 can cause excessive paging
and wear out your SSD so it is not currently recommended.

-Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
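As a rough illustration of the memoryuse limit and the pageout sysctl
described above, something along these lines should work from /bin/sh on a
recent master. The 500 MB cap and the browser command are only examples,
not taken from the announcement, and ulimit -m is assumed to take its
value in 1K blocks:

    # Start one program with its resident set capped at roughly 500 MB
    # (512000 1K blocks). The limit applies to the program started in the
    # subshell, not to the login shell itself.
    ( ulimit -m 512000; exec firefox ) &

    # Inspect the current enforcement mode (1 is the default: pages beyond
    # the limit are removed from the process pmap and passively deactivated).
    sysctl vm.pageout_memuse_mode

    # As root, disable enforcement entirely (the pre-change behavior) ...
    sysctl vm.pageout_memuse_mode=0

    # ... or switch to active pageout; as noted above this can cause
    # excessive paging and SSD wear, so it is not currently recommended.
    sysctl vm.pageout_memuse_mode=2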