Mailing List Archive: Linux: Kernel

[PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning


alex.shi at intel

May 9, 2012, 10:00 PM

Post #1 of 7 (125 views)
[PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

kernel will replace cr3 rewrite with invlpg when
tlb_flush_entries <= active_tlb_entries / 2^tlb_flushall_factor
if tlb_flushall_factor is -1, kernel won't do this replacement.

User can modify its value according to specific applications.
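
The rule above can be sketched in plain user-space C. This is only an illustration of the stated inequality, not the code from the patch series: the names `use_invlpg` and `tlb_rule_examples` and the parameter names are hypothetical, and treating every negative factor like -1 is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether to flush page by page with invlpg
 * instead of rewriting cr3 (a full TLB flush).
 *
 * flush_entries:  number of TLB entries the caller wants to flush
 * active_entries: number of TLB entries assumed to be in use
 * factor:         tlb_flushall_factor; negative is assumed to mean
 *                 "never replace the cr3 rewrite"
 */
static bool use_invlpg(unsigned long flush_entries,
                       unsigned long active_entries,
                       short factor)
{
	if (factor < 0)
		return false;

	/* Shifting by the type width or more is undefined; the quotient is 0. */
	if ((unsigned int)factor >= sizeof(unsigned long) * CHAR_BIT)
		return flush_entries == 0;

	/* active_entries / 2^factor, written as a right shift. */
	return flush_entries <= (active_entries >> factor);
}

/* Worked examples: 512 active entries, factor 5 gives a limit of 16. */
void tlb_rule_examples(void)
{
	assert(use_invlpg(16, 512, 5));   /* 16 <= 512 / 32 */
	assert(!use_invlpg(17, 512, 5));  /* 17 >  512 / 32 */
	assert(!use_invlpg(1, 512, -1));  /* replacement disabled */
}
```

A larger factor lowers the limit, so fewer range flushes are done with invlpg and more fall back to the cr3 rewrite.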

Signed-off-by: Alex Shi <alex.shi [at] intel>
---
Documentation/ABI/testing/sysfs-devices-system-cpu | 12 ++++++
arch/x86/Kconfig.debug | 11 ++++++
arch/x86/kernel/cpu/common.c | 37 ++++++++++++++++++++
drivers/base/cpu.c | 4 ++
include/linux/cpu.h | 4 ++
5 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index e7be75b..05f8eb7 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -78,6 +78,18 @@ Description: Dynamic addition and removal of CPU's. This is not hotplug
the system. Information writtento the file to remove CPU's
is architecture specific.

+What: /sys/devices/system/cpu/tlb_flushall_factor
+Date: May 2012
+Contact: Linux kernel mailing list <linux-kernel [at] vger>
+Description: tlb_flushall_factor show and setting interface
+ tlb_flushall_factor shows the balance point in replacing cr3
+ writting with multiple 'invlpg'. It will do this replacement
+ when flush_tlb_lines <= active_lines/2^tlb_flushall_factor
+ If tlb_flushall_factor is -1, means the replacement will be
+ disabled.
+
+ User can set this for the specific CPU or application.
+
What: /sys/devices/system/cpu/cpu#/node
Date: October 2009
Contact: Linux memory management mailing list <linux-mm [at] kvack>
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index e46c214..5b87493 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -129,6 +129,17 @@ config DOUBLEFAULT
option saves about 4k and might cause you much additional grey
hair.

+config DEBUG_TLBFLUSH
+ bool "Enable user level tlb flush all setting"
+ depends on DEBUG_KERNEL && (X86_64 || X86_INVLPG)
+ ---help---
+ This option allows user tune tlb_flushall_factor knob that under
+ /sys/devices/system/cpu, set to -1 means do tlb flush all for any
+ multiple tlb lines evacuation demand. Otherwise kernel will use
+ multiple 'invlpg' for the demand when
+ flush_lines <= active_tlb_lines / 2^tlb_flushall_factor
+ If in doubt, say "N"
+
config IOMMU_DEBUG
bool "Enable IOMMU debugging"
depends on GART_IOMMU && DEBUG_KERNEL
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8879d20..d1986c6 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -481,6 +481,43 @@ void __cpuinit cpu_detect_tlb(struct cpuinfo_x86 *c)
tlb_flushall_factor);
}

+#ifdef CONFIG_DEBUG_TLBFLUSH
+static ssize_t __tlb_flushall_factor_store(const char *buf,
+ size_t count, int smt)
+{
+ short factor = 0;
+
+ if (sscanf(buf, "%hd", &factor) != 1)
+ return -EINVAL;
+
+ tlb_flushall_factor = factor;
+
+ return count;
+}
+
+static ssize_t tlb_flushall_factor_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return sprintf(buf, "%hd\n", tlb_flushall_factor);
+}
+static ssize_t tlb_flushall_factor_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ return __tlb_flushall_factor_store(buf, count, 0);
+}
+
+static DEVICE_ATTR(tlb_flushall_factor, 0644,
+ tlb_flushall_factor_show,
+ tlb_flushall_factor_store);
+
+int __init create_sysfs_tlb_flushall_factor(struct device *dev)
+{
+ return device_create_file(dev, &dev_attr_tlb_flushall_factor);
+}
+#endif
+
void __cpuinit detect_ht(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_X86_HT
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index adf937b..dc0f77b 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -331,6 +331,10 @@ void __init cpu_dev_init(void)

cpu_dev_register_generic();

+#ifdef CONFIG_DEBUG_TLBFLUSH
+ create_sysfs_tlb_flushall_factor(cpu_subsys.dev_root);
+#endif
+
#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
sched_create_sysfs_power_savings_entries(cpu_subsys.dev_root);
#endif
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index ee28844..3eb85a5 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -36,6 +36,10 @@ extern void cpu_remove_dev_attr(struct device_attribute *attr);
extern int cpu_add_dev_attr_group(struct attribute_group *attrs);
extern void cpu_remove_dev_attr_group(struct attribute_group *attrs);

+#ifdef CONFIG_DEBUG_TLBFLUSH
+extern int create_sysfs_tlb_flushall_factor(struct device *dev);
+#endif
+
extern int sched_create_sysfs_power_savings_entries(struct device *dev);

#ifdef CONFIG_HOTPLUG_CPU
--
1.7.5.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


bp at amd64

May 10, 2012, 1:27 AM

Post #2 of 7 (114 views)
Re: [PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

On Thu, May 10, 2012 at 01:00:13PM +0800, Alex Shi wrote:
> kernel will replace cr3 rewrite with invlpg when
> tlb_flush_entries <= active_tlb_entries / 2^tlb_flushall_factor
> if tlb_flushall_factor is -1, kernel won't do this replacement.
>
> User can modify its value according to specific applications.
>
> Signed-off-by: Alex Shi <alex.shi [at] intel>

Just minor nitpicks below.

> ---
> Documentation/ABI/testing/sysfs-devices-system-cpu | 12 ++++++
> arch/x86/Kconfig.debug | 11 ++++++
> arch/x86/kernel/cpu/common.c | 37 ++++++++++++++++++++
> drivers/base/cpu.c | 4 ++
> include/linux/cpu.h | 4 ++
> 5 files changed, 68 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index e7be75b..05f8eb7 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -78,6 +78,18 @@ Description: Dynamic addition and removal of CPU's. This is not hotplug
> the system. Information writtento the file to remove CPU's
> is architecture specific.
>
> +What: /sys/devices/system/cpu/tlb_flushall_factor
> +Date: May 2012
> +Contact: Linux kernel mailing list <linux-kernel [at] vger>
> +Description: tlb_flushall_factor show and setting interface
> + tlb_flushall_factor shows the balance point in replacing cr3
> + writting with multiple 'invlpg'. It will do this replacement
> + when flush_tlb_lines <= active_lines/2^tlb_flushall_factor
> + If tlb_flushall_factor is -1, means the replacement will be
> + disabled.
> +
> + User can set this for the specific CPU or application.
> +
> What: /sys/devices/system/cpu/cpu#/node
> Date: October 2009
> Contact: Linux memory management mailing list <linux-mm [at] kvack>
> diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
> index e46c214..5b87493 100644
> --- a/arch/x86/Kconfig.debug
> +++ b/arch/x86/Kconfig.debug
> @@ -129,6 +129,17 @@ config DOUBLEFAULT
> option saves about 4k and might cause you much additional grey
> hair.
>
> +config DEBUG_TLBFLUSH
> + bool "Enable user level tlb flush all setting"

bool "Set top limit of TLB entries to flush one-by-one"

> + depends on DEBUG_KERNEL && (X86_64 || X86_INVLPG)
> + ---help---
> + This option allows user tune tlb_flushall_factor knob that under

allows the user to tune the ... (remove "that")

> + /sys/devices/system/cpu, set to -1 means do tlb flush all for any

. Set to -1 means to flush the whole TLB for any

> + multiple tlb lines evacuation demand. Otherwise kernel will use
> + multiple 'invlpg' for the demand when
> + flush_lines <= active_tlb_lines / 2^tlb_flushall_factor
> + If in doubt, say "N"
> +
> config IOMMU_DEBUG
> bool "Enable IOMMU debugging"
> depends on GART_IOMMU && DEBUG_KERNEL
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 8879d20..d1986c6 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -481,6 +481,43 @@ void __cpuinit cpu_detect_tlb(struct cpuinfo_x86 *c)
> tlb_flushall_factor);
> }
>
> +#ifdef CONFIG_DEBUG_TLBFLUSH
> +static ssize_t __tlb_flushall_factor_store(const char *buf,
> + size_t count, int smt)
> +{
> + short factor = 0;
> +
> + if (sscanf(buf, "%hd", &factor) != 1)
> + return -EINVAL;

This means only single-digit factors, right?

Why not use kstrtoul?

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


gregkh at linuxfoundation

May 10, 2012, 8:13 AM

Post #3 of 7 (115 views)
Re: [PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

On Thu, May 10, 2012 at 01:00:13PM +0800, Alex Shi wrote:
> kernel will replace cr3 rewrite with invlpg when
> tlb_flush_entries <= active_tlb_entries / 2^tlb_flushall_factor
> if tlb_flushall_factor is -1, kernel won't do this replacement.
>
> User can modify its value according to specific applications.
>
> Signed-off-by: Alex Shi <alex.shi [at] intel>
> ---
> Documentation/ABI/testing/sysfs-devices-system-cpu | 12 ++++++
> arch/x86/Kconfig.debug | 11 ++++++
> arch/x86/kernel/cpu/common.c | 37 ++++++++++++++++++++
> drivers/base/cpu.c | 4 ++
> include/linux/cpu.h | 4 ++
> 5 files changed, 68 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index e7be75b..05f8eb7 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -78,6 +78,18 @@ Description: Dynamic addition and removal of CPU's. This is not hotplug
> the system. Information writtento the file to remove CPU's
> is architecture specific.
>
> +What: /sys/devices/system/cpu/tlb_flushall_factor
> +Date: May 2012
> +Contact: Linux kernel mailing list <linux-kernel [at] vger>
> +Description: tlb_flushall_factor show and setting interface
> + tlb_flushall_factor shows the balance point in replacing cr3
> + writting with multiple 'invlpg'. It will do this replacement
> + when flush_tlb_lines <= active_lines/2^tlb_flushall_factor
> + If tlb_flushall_factor is -1, means the replacement will be
> + disabled.
> +
> + User can set this for the specific CPU or application.

Nowhere do you say this is x86 only, please fix that.

> +
> What: /sys/devices/system/cpu/cpu#/node
> Date: October 2009
> Contact: Linux memory management mailing list <linux-mm [at] kvack>
> diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
> index e46c214..5b87493 100644
> --- a/arch/x86/Kconfig.debug
> +++ b/arch/x86/Kconfig.debug
> @@ -129,6 +129,17 @@ config DOUBLEFAULT
> option saves about 4k and might cause you much additional grey
> hair.
>
> +config DEBUG_TLBFLUSH
> + bool "Enable user level tlb flush all setting"
> + depends on DEBUG_KERNEL && (X86_64 || X86_INVLPG)
> + ---help---
> + This option allows user tune tlb_flushall_factor knob that under
> + /sys/devices/system/cpu, set to -1 means do tlb flush all for any
> + multiple tlb lines evacuation demand. Otherwise kernel will use
> + multiple 'invlpg' for the demand when
> + flush_lines <= active_tlb_lines / 2^tlb_flushall_factor
> + If in doubt, say "N"

Yeah, another tunable that no one knows how to use.

Really, why is this here at all? As others pointed out, this really
looks like a debugging thing that almost no one will ever need, so
please, put it in debugfs.

> +
> config IOMMU_DEBUG
> bool "Enable IOMMU debugging"
> depends on GART_IOMMU && DEBUG_KERNEL
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 8879d20..d1986c6 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -481,6 +481,43 @@ void __cpuinit cpu_detect_tlb(struct cpuinfo_x86 *c)
> tlb_flushall_factor);
> }
>
> +#ifdef CONFIG_DEBUG_TLBFLUSH
> +static ssize_t __tlb_flushall_factor_store(const char *buf,
> + size_t count, int smt)
> +{
> + short factor = 0;
> +
> + if (sscanf(buf, "%hd", &factor) != 1)
> + return -EINVAL;
> +
> + tlb_flushall_factor = factor;
> +
> + return count;
> +}
> +
> +static ssize_t tlb_flushall_factor_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + return sprintf(buf, "%hd\n", tlb_flushall_factor);
> +}
> +static ssize_t tlb_flushall_factor_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + return __tlb_flushall_factor_store(buf, count, 0);
> +}
> +
> +static DEVICE_ATTR(tlb_flushall_factor, 0644,
> + tlb_flushall_factor_show,
> + tlb_flushall_factor_store);
> +
> +int __init create_sysfs_tlb_flushall_factor(struct device *dev)
> +{
> + return device_create_file(dev, &dev_attr_tlb_flushall_factor);
> +}
> +#endif
> +
> void __cpuinit detect_ht(struct cpuinfo_x86 *c)
> {
> #ifdef CONFIG_X86_HT
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index adf937b..dc0f77b 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -331,6 +331,10 @@ void __init cpu_dev_init(void)
>
> cpu_dev_register_generic();
>
> +#ifdef CONFIG_DEBUG_TLBFLUSH
> + create_sysfs_tlb_flushall_factor(cpu_subsys.dev_root);
> +#endif

Do the #ifdef in the .h file, not the .c file please, no matter how you
end up doing this (debugfs vs. sysfs.)

thanks,

greg k-h
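
The "#ifdef in the .h file" request usually means providing an empty inline stub in the header when the option is off, so the call site stays unconditional. A minimal sketch, compiled here as a single user-space file; `cpu_dev_init_sketch` is a hypothetical stand-in for the real caller, and `struct device` is left opaque:

```c
#include <stddef.h>

struct device;  /* opaque here; the real definition lives in the kernel */

/* --- what the header would contain --- */
#ifdef CONFIG_DEBUG_TLBFLUSH
int create_sysfs_tlb_flushall_factor(struct device *dev);
#else
/* Stub: compiles away when the option is off, so callers need no #ifdef. */
static inline int create_sysfs_tlb_flushall_factor(struct device *dev)
{
	(void)dev;
	return 0;
}
#endif

/* --- what the caller in the .c file would look like --- */
static int cpu_dev_init_sketch(struct device *root)
{
	/* No #ifdef at the call site. */
	return create_sysfs_tlb_flushall_factor(root);
}
```

With the option undefined, as in this sketch, the call resolves to the stub and returns 0.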


alex.shi at intel

May 10, 2012, 5:52 PM

Post #4 of 7 (117 views)
Re: [PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

>> +#ifdef CONFIG_DEBUG_TLBFLUSH

>> +static ssize_t __tlb_flushall_factor_store(const char *buf,
>> + size_t count, int smt)
>> +{
>> + short factor = 0;
>> +
>> + if (sscanf(buf, "%hd", &factor) != 1)
>> + return -EINVAL;
>
> This means only single-digit factors, right?


No, you can try '32' '16' etc. not a 'single-digit'.

>
> Why not use kstrtoul?


any advantage of this?

>




alex.shi at intel

May 10, 2012, 5:59 PM

Post #5 of 7 (116 views)
Re: [PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

> Nowhere do you say this is x86 only, please fix that.


sure.

>
> Yeah, another tunable that no one knows how to use.
>
> Really, why is this here at all? As others pointed out, this really
> looks like a debugging thing that almost no one will ever need, so
> please, put it in debugfs.


Ok.

>
>
> Do the #ifdef in the .h file, not the .c file please, no matter how you
> end up doing this (debugfs vs. sysfs.)


Ok.

>
> thanks,
>
> greg k-h
>




bp at amd64

May 11, 2012, 2:51 AM

Post #6 of 7 (115 views)
Re: [PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

On Fri, May 11, 2012 at 08:52:09AM +0800, Alex Shi wrote:
> >> +#ifdef CONFIG_DEBUG_TLBFLUSH
>
> >> +static ssize_t __tlb_flushall_factor_store(const char *buf,
> >> + size_t count, int smt)
> >> +{
> >> + short factor = 0;
> >> +
> >> + if (sscanf(buf, "%hd", &factor) != 1)
> >> + return -EINVAL;
> >
> > This means only single-digit factors, right?
>
> No, you can try '32' '16' etc. not a 'single-digit'.

Ah, misread sscanf, nevermind.

> > Why not use kstrtoul?
>
> any advantage of this?

Well, sscanf uses simple_strto* and those miss overflow checks etc, see
33ee3b2e2eb9b4b6c64dcf9ed66e2ac3124e748c for details.

Btw, there are other kstrto* functions which you could use to fit better
the argument type and size passed to tlb_flushall_factor.
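
To illustrate the difference in user space: the kernel's kstrto* helpers (kstrtos16() would match a short) reject overflow and trailing garbage instead of silently accepting them. The sketch below imitates that strictness with strtol(); it is not kernel code, and `parse_factor` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-in for a strict parser of a signed
 * 16-bit value. Returns 0 and stores the value on success, -EINVAL for
 * malformed input, -ERANGE when the number does not fit in a short.
 */
static int parse_factor(const char *buf, short *out)
{
	char *end;
	long val;

	assert(buf != NULL && out != NULL);

	/* strtol() skips leading whitespace; reject it explicitly. */
	if (*buf == '\0' || isspace((unsigned char)*buf))
		return -EINVAL;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf)
		return -EINVAL;          /* no digits at all */
	if (errno == ERANGE || val < SHRT_MIN || val > SHRT_MAX)
		return -ERANGE;          /* overflow is reported, not truncated */

	/* Allow one trailing newline, as "echo 5 > file" produces. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;          /* trailing garbage such as "12abc" */

	*out = (short)val;
	return 0;
}
```

Here "32\n" and "-1" parse successfully, while "99999" is rejected with -ERANGE and "12abc" with -EINVAL.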


--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


alex.shi at intel

May 11, 2012, 5:53 AM

Post #7 of 7 (115 views)
Re: [PATCH v4 7/7] x86/tlb: add tlb_flushall_factor into sysfs for user testing/tuning

On 05/11/2012 05:51 PM, Borislav Petkov wrote:

> On Fri, May 11, 2012 at 08:52:09AM +0800, Alex Shi wrote:
>>>> +#ifdef CONFIG_DEBUG_TLBFLUSH
>>
>>>> +static ssize_t __tlb_flushall_factor_store(const char *buf,
>>>> + size_t count, int smt)
>>>> +{
>>>> + short factor = 0;
>>>> +
>>>> + if (sscanf(buf, "%hd", &factor) != 1)
>>>> + return -EINVAL;
>>>
>>> This means only single-digit factors, right?
>>
>> No, you can try '32' '16' etc. not a 'single-digit'.
>
> Ah, misread sscanf, nevermind.
>
>>> Why not use kstrtoul?
>>
>> any advantage of this?
>
> Well, sscanf uses simple_strto* and those miss overflow checks etc, see
> 33ee3b2e2eb9b4b6c64dcf9ed66e2ac3124e748c for details.


Thanks for the reminder!

>
> Btw, there are other kstrto* functions which you could use to fit better
> the argument type and size passed to tlb_flushall_factor.
>
>


