GNOME Bugzilla – Bug 323354
Support a dynamic number of CPUs
Last modified: 2018-01-10 19:44:25 UTC
In include/glibtop/cpu.h: #define GLIBTOP_NCPU 4 struct _glibtop_cpu { guint64 flags, total, /* GLIBTOP_CPU_TOTAL */ user, /* GLIBTOP_CPU_USER */ nice, /* GLIBTOP_CPU_NICE */ sys, /* GLIBTOP_CPU_SYS */ idle, /* GLIBTOP_CPU_IDLE */ iowait, /* GLIBTOP_CPU_IOWAIT */ irq, /* GLIBTOP_CPU_IRQ */ softirq, /* GLIBTOP_CPU_SOFTIRQ */ frequency, /* GLIBTOP_CPU_FREQUENCY */ xcpu_total [GLIBTOP_NCPU], /* GLIBTOP_XCPU_TOTAL */ xcpu_user [GLIBTOP_NCPU], /* GLIBTOP_XCPU_USER */ xcpu_nice [GLIBTOP_NCPU], /* GLIBTOP_XCPU_NICE */ xcpu_sys [GLIBTOP_NCPU], /* GLIBTOP_XCPU_SYS */ xcpu_idle [GLIBTOP_NCPU], /* GLIBTOP_XCPU_IDLE */ xcpu_iowait [GLIBTOP_NCPU], /* GLIBTOP_XCPU_IOWAIT */ xcpu_irq [GLIBTOP_NCPU], /* GLIBTOP_XCPU_IRQ */ xcpu_softirq [GLIBTOP_NCPU], /* GLIBTOP_XCPU_SOFTIRQ */ xcpu_flags; /* GLIBTOP_XCPU_IDLE */ }; Changing the number of CPUs would break ABI, but it would be a very good idea to make this dynamic, rather than hard-coded as now.
My guess is that we should: 1. increase GLIBTOP_NCPU to be at least GLIBTOP_MAX_CPU (and increasing that one as well, possibly to 32) 2. add an unsigned int 32 member to _glibtop_cpu with the number of CPUs 3. Make all the GLIBTOP_NCPU dependant struct members pointers 3. add accessors and convenience functions to add new CPUs, access items in each of the arrays (checking that we don't get > nr_cpus), and one to free the _glibtop_cpu struct altogether
We agreed, on IRC, that it should be enough to increase GLIBTOP_NCPU (to 16), and add a flag in glibtop_global_server to tell front-ends that the machine has more CPUs than what libgtop supports.
there's no flag in glibtop_server. but it has a member 'int ncpu'. I think it would be OK to set it to something >= GLIBTOP_NCPU. Then we could use glibtop_cpu.flags & GLIBTOP_CPU_TOO_MANY if server->ncpu >= GLIBTOP_NCPU
It's such a mess :/ I've realized that server->ncpu is (mis)used (for linux) in : - glibtop_get_proc_time - glibtop_get_sysinfo - glibtop_open - glibtop_get_cpu This should be fixed. I first thought i should add a GLIBTOP_CPU_TOO_MANY to glibtop_cpu.flags but has the number of cpu is used in 4 differents places, i think it would be easier to : - add a 'guint real_ncpu' to 'glibtop_server' which would give the real number of cpus on the system. 'ncpu' (always < GLIBTOP_NCPU) would be the number of monitored cpus. - warn in 'glibtop_open' if there are too many cpu - and increase GLIBTOP_NCPU to 16 (or 32 ?). Would make sizeof(glibtop_cpu) = 1112B (or 2136B). I'll then fix the 'server->ncpu' issues. What do you think about this ?
Rather than hard code the number of CPUs, I would do something like this: struct _glibtop_cpu { guint64 flags, total, /* GLIBTOP_CPU_TOTAL */ user, /* GLIBTOP_CPU_USER */ nice, /* GLIBTOP_CPU_NICE */ sys, /* GLIBTOP_CPU_SYS */ idle, /* GLIBTOP_CPU_IDLE */ iowait, /* GLIBTOP_CPU_IOWAIT */ irq, /* GLIBTOP_CPU_IRQ */ softirq, /* GLIBTOP_CPU_SOFTIRQ */ frequency, /* GLIBTOP_CPU_FREQUENCY */ struct _glibtop_xcpu *head, /* first CPU in the list */ xcpu_flags; /* GLIBTOP_XCPU_IDLE */ }; struct _glibtop_xcpu { xcpu_total [GLIBTOP_NCPU], /* GLIBTOP_XCPU_TOTAL */ xcpu_user [GLIBTOP_NCPU], /* GLIBTOP_XCPU_USER */ xcpu_nice [GLIBTOP_NCPU], /* GLIBTOP_XCPU_NICE */ xcpu_sys [GLIBTOP_NCPU], /* GLIBTOP_XCPU_SYS */ xcpu_idle [GLIBTOP_NCPU], /* GLIBTOP_XCPU_IDLE */ xcpu_iowait [GLIBTOP_NCPU], /* GLIBTOP_XCPU_IOWAIT */ xcpu_irq [GLIBTOP_NCPU], /* GLIBTOP_XCPU_IRQ */ xcpu_softirq [GLIBTOP_NCPU], /* GLIBTOP_XCPU_SOFTIRQ */ struct _glibtop_xcpu *next, /* next CPU in the list */ }; This is an API change, but shouldn't affect too many modules. It would make the structure future proof, and not require people with only 1 CPU to allocate 2K or 4K or 8K for 16 or 32 or 64 possible CPUs.
* configure.in: * glibtop.h: * include/glibtop/cpu.h: * include/glibtop/procmap.h: Increased GLIBTOP_NCPU to 32. Added real_ncpu to struct glibtop. Added smaps members to glibtop_map_entry. Bumped version number to 2.13.0 Broke ABI again. Sorry. Too late. This would require a lot of work because it'd need to update the marshalling code between client and daemon. And test it. And patch every arch. And test it. Allocating 2K is not a problem at all. But as you noticed, there's a real need for a new API. May be a libgtop-ng :)
I don't think it's too late at all. This temporary fix will suffice, but it may be possible to prepare a more complete fix in the 2.14 timeframe. I think once the patch is prepared for a few platforms, the rest will be easy to port. I am retitling this bug to reflect the new requirement and marking it as an enhancement.
which glibtop gives 8 CPU support? I can't seem to get a concise answer for this. After a session in gdb found only 4 are currently supported though /proc/stat show all CPU(s) info.
Well, 2.13.0 and higher should handle up to 32. Which version are you running ? Could you also please attach your /proc/cpuinfo ?
and /proc/stat of course :)
2.4.7 is what we are using.... That answers my question thanks! [ /proc/cpuinfo ] processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5337.05 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.60 processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 0 siblings : 4 core id : 2 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.67 processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 0 siblings : 4 core id : 3 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.66 processor : 4 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 1 siblings : 4 core id : 4 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.67 processor : 5 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 1 siblings : 4 core id : 5 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.70 processor : 6 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 1 siblings : 4 core id : 6 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.71 processor : 7 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz stepping : 7 cpu MHz : 2667.391 cache size : 4096 KB physical id : 1 siblings : 4 core id : 7 cpu cores : 4 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr bogomips : 5332.71 [ /proc/stat ] cpu 37436 12144 54093 13212905 38681 1618 0 cpu0 7298 1186 9232 1616959 33580 1464 0 cpu1 15391 1182 15952 1636011 901 153 0 cpu2 3960 1009 7210 1654539 2874 0 0 cpu3 2939 2020 8125 1656273 234 0 0 cpu4 1758 1837 4167 1661610 220 0 0 cpu5 3875 2744 4247 1658444 281 0 0 cpu6 1160 1836 2686 1663532 377 0 0 cpu7 1051 325 2471 1665534 210 0 0 intr 22197011 16698834 10 0 7 7 232962 7 0 1 0 0 339307 66 0 149556 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3064790 0 0 0 0 0 0 0 1711464 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ctxt 146994984 btime 1201269593 processes 68384 procs_running 1 procs_blocked 0
It should be safe to increase GLIBTOP_NCPU to 8 as long as you rebuild everything against this patched version because it breaks the ABI.
Currently GLIBTOP_NCPU is 32, and some people have more than 32 cores, so we still have this problem. The max number should not be hardcoded indeed.
*** Bug 677204 has been marked as a duplicate of this bug. ***
Hi, this is Jason from HP. I'm not sure what's the status of this problem, but the value of GLIBTOP_NCPU is still 32 in the latest version. Here I'd like to update some information about this problem, for we've been really affected by the limitation of these cpu related definitions. We have some big machines which run linux. The cpu number of these machines is normally from 240~480, maybe bigger later. So when using gnome-system-monitor to see the cpu usage statics, it will not display the correct cpu number. I looked into the code of libgtop and gnome-system-monitor, and the root cause is quite clear. It's limited by two Macros: GLIBTOP_NCPU and BUFSIZ For a typical machine of 480 CPUs, the value of GLIBTOP_NCPU should be more than 480. It's defined in libgtop/include/glibtop/cpu.h The macro BUFSIZ is used to read /proc/stat into a buffer. The current size is 8192, which is a recursive definition and finally defined in /usr/include/_G_config.h. In my environment of 480 cpus, the size of /proc/stat is about 40k, which changed according to the run time. So it will be truncated by the edge of 8192. As a workround, it's quite easy to have it fixed by expanding the value of these two macros. But as the topic said, a dynamic size identification solution is the right way to go. I will attched some screen shots of the problem in my environments, and my workround result. My libgtop version is 2.28 Thanks.
Created attachment 262995 [details] expanded GLIBTOP_NCPU to 256 I expanded GLIBTOP_NCPU to 256, and the displayed cpu is increased. But still limited by the BUFSIZ of 8192. I checked the truncated buffer of /proc/stat, it's actually is 8192 bytes.
Created attachment 262996 [details] expanded GLIBTOP_NCPU to 512, and BUFSIZ to 81920 Now, I expanded GLIBTOP_NCPU to 256, and BUFSIZ to 81920. It fixed our problem, just as a workround.
For providing a completed solution will affect several modules, is there a schedule or plan to do it?
I'm thinking about migrating the whole cpu info array (most dynamic arrays in fact, with CPU being the most important) to GList, glib already being a dependency of libgtop. That of course takes time, and I unfortunately can't give a real estimation on how long it will take, as libgtop supports multiple operating systems, and I only have a Linux dev environment set up, so I'll have to either find someone willing to port the changes to the other supported OSs, or set up VMs for myself, both of these taking quite some time. System monitor and other projects are using libgtop (there are some desktop extensions, like Gnome Shell extensions, XFCE plugins and Ubuntu indicators), so these will also need an update (maybe there also are some others) In the meantime, I am considering temporarily accepting your patch, until the Glist migration gets in place, or I manage to find an acceptable transition method. Would accepting your patch in libgtop help you out at least temporarily?
Thanks Robert for your comments. Here I attached this temporary patch below. Please review it. Thanks.
Created attachment 263971 [details] extending-max-cpu-number.patch
Thanks for the patch, pushed to head, will push a release with the next GNOME release train.
Thank you Robert.
Created attachment 274211 [details] [review] Fix libtool versioning Hello Jason and Robert, you need to bump the soname otherwise the ABI breakage gets unnoticed and leads to segfaults because of the glibtop_cpu struct size change. I'm not sure you should have done it during the 2.28 cycle, but here's the patch to bump the soname/libtool versioning.
Review of attachment 274211 [details] [review]: Thank you for the patch, Benoît. I have been thinking on whether I should increase the "libgtop_current", but I have decided not to (obviously, that was the wrong choice). However, this has been done during the 2.28 cycle, but I have prepared only one release, and that was 8 months ago, so nothing's broken yet in any stable released version, and with your patch committed, it should not happen in the future either.
OK then you may want to increase the version to 3.<something> and even branch.
Hello, I recently used gnome-system-monitor 2.28 on RHEL6 for a system with 480 threads and found that it wasn't supporting that high number of threads. I rebuilt the RPM package of libgtop2-2.28.0-4 with the following modifications in order to have it work correctly: perl -pi -e 's/char buffer \[BUFSIZ\]/char buffer [10*BUFSIZ]/' sysdeps/linux/cpu.c perl -pi -e 's/char buffer \[BUFSIZ\]/char buffer [10*BUFSIZ]/' sysdeps/linux/open.c perl -pi -e 's/char buffer \[16384\]/char buffer [40*16384]/' sysdeps/linux/sysinfo.c perl -pi -e 's/#define GLIBTOP_NCPU\s*256/#define GLIBTOP_NCPU 512/' ./include/glibtop/cpu.h We checked this was working fine afterwards. Probably not a final solution as mentioned earlier, but a work around to a current limitation for people having such large systems. RHEL is following this at this URL: https://bugzilla.redhat.com/show_bug.cgi?id=726397 and the one mentioned in that BR.
Release 2.30 has GLIBTOP_NCPU set to 1024. Yes it works, just keep in mind that it breaks the ABI so you need to rebuild all the packages depending on libgtop.
Review of attachment 274211 [details] [review]: Committed a long time ago, marking as such.
Created attachment 343827 [details] [review] Patch to dynamically allocate memory when reading /proc/cpuinfo We've run into a related issue with the way libgtop determines the number of CPUs. A 64k buffer is allocated to hold the contents of /proc/cpuinfo, and the contents of this buffer are parsed to count the number of cores. On a server with a large number of cores (>64 or so), /proc/cpuinfo can contain more than 64k of data, so libgtop reports fewer cores than are present.
Created attachment 343943 [details] [review] Use g_get_file_contents to read /proc/cpuinfo Hello, here's a shorter version using glib g_get_file_contents.
Review of attachment 343943 [details] [review]: This was pushed to master as commit b0ab056e99d83246475ef097dd27c04fafd49d70 some time ago, just after 2.34.2, and was in the 2.35.90 release.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/libgtop/issues/4.