From f8891e5e1f93a128c3900f82035e8541357896a7 Mon Sep 17 00:00:00 2001 From: Christoph Lameter Date: Fri, 30 Jun 2006 01:55:45 -0700 Subject: [PATCH] Light weight event counters The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only loose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows a building of linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single instruction increment instruction being emitted (i386, x86_64) or in the increment being hidden though instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/s390/appldata/appldata_base.c | 1 - arch/s390/appldata/appldata_mem.c | 20 ++++++++++---------- 2 files changed, 10 insertions(+), 11 deletions(-) (limited to 'arch/s390/appldata') diff --git a/arch/s390/appldata/appldata_base.c b/arch/s390/appldata/appldata_base.c index 61bc44626c0..2476ca739c1 100644 --- a/arch/s390/appldata/appldata_base.c +++ b/arch/s390/appldata/appldata_base.c @@ -766,7 +766,6 @@ unsigned long nr_iowait(void) #endif /* MODULE */ EXPORT_SYMBOL_GPL(si_swapinfo); EXPORT_SYMBOL_GPL(nr_threads); -EXPORT_SYMBOL_GPL(get_full_page_state); EXPORT_SYMBOL_GPL(nr_running); EXPORT_SYMBOL_GPL(nr_iowait); //EXPORT_SYMBOL_GPL(nr_context_switches); diff --git a/arch/s390/appldata/appldata_mem.c b/arch/s390/appldata/appldata_mem.c index 180ba79a626..4811e2dac86 100644 --- a/arch/s390/appldata/appldata_mem.c +++ b/arch/s390/appldata/appldata_mem.c @@ -107,21 +107,21 @@ static void appldata_get_mem_data(void *data) * serialized through the appldata_ops_lock and can use static */ static struct sysinfo val; - static struct page_state ps; + unsigned long ev[NR_VM_EVENT_ITEMS]; struct appldata_mem_data *mem_data; mem_data = data; mem_data->sync_count_1++; - get_full_page_state(&ps); - mem_data->pgpgin = ps.pgpgin >> 1; - mem_data->pgpgout = ps.pgpgout >> 1; - mem_data->pswpin = ps.pswpin; - mem_data->pswpout = ps.pswpout; - mem_data->pgalloc = ps.pgalloc_high + ps.pgalloc_normal + - ps.pgalloc_dma; - mem_data->pgfault = ps.pgfault; - mem_data->pgmajfault = ps.pgmajfault; + all_vm_events(ev); + mem_data->pgpgin = ev[PGPGIN] >> 1; + mem_data->pgpgout = ev[PGPGOUT] >> 1; + mem_data->pswpin = ev[PSWPIN]; + mem_data->pswpout = ev[PSWPOUT]; + mem_data->pgalloc = ev[PGALLOC_HIGH] + ev[PGALLOC_NORMAL] + + ev[PGALLOC_DMA]; + mem_data->pgfault = ev[PGFAULT]; + mem_data->pgmajfault = ev[PGMAJFAULT]; si_meminfo(&val); mem_data->sharedram = val.sharedram; -- cgit