Runtime power saving on Linux - not all CPU use is equal
Posted 20 May 2007 at 14:28 UTC by mjg59
As the number of people using Linux in mobile situations increases, the
need to reduce power consumption also increases. While some of these
power savings are obvious (like turning off hardware when it's not being
used), others are more subtle. The combination of recent changes to the
kernel and some small adjustments to existing code can have a
significant effect on your battery life.
Modern processors have a variety of low power modes that can be entered
while idle. These "C states" are numbered, currently ranging from C0 (a
running CPU) to C4 (a deep runtime sleep state). The problem with these
states is that the deeper the sleep, the longer it takes the CPU to wake
up. In order to avoid excessive reductions in performance, the kernel
must keep track of the processor usage pattern and avoid putting the CPU
into a deep sleep mode when it's likely that it'll be needed for more
work in the near future.
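As a rough illustration of that trade-off, here is a minimal sketch of picking a sleep state from an expected idle period. The state table, its latency and power figures, and the latency_factor threshold are all made-up values for the example, not numbers from any real processor or from the kernel's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    exit_latency_us: int  # time needed to wake up (illustrative value)
    power_mw: int         # power drawn while in the state (illustrative value)


# Hypothetical idle states, ordered from shallowest to deepest.
C_STATES = [
    CState("C1", 1, 1000),
    CState("C2", 20, 500),
    CState("C3", 100, 250),
    CState("C4", 400, 100),
]


def pick_idle_state(expected_idle_us, states=C_STATES, latency_factor=2):
    """Return the deepest state whose wake-up cost is small compared
    with the time we expect to stay idle."""
    chosen = states[0]  # the shallowest state is always acceptable
    for state in states:
        if state.exit_latency_us * latency_factor <= expected_idle_us:
            chosen = state
    return chosen


print(pick_idle_state(50).name)    # short idle period: a shallow state
print(pick_idle_state(1000).name)  # long idle period: the deepest state
```

The longer the predicted idle period, the deeper the state that passes the check, which is why fewer wakeups translate into deeper sleep.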
Traditionally, the kernel has had a fixed timer tick - that is, a fixed
number of times a second, the timer will generate an interrupt, wake the
kernel and allow processes to be rescheduled. This limits the maximum
amount of time the CPU can remain idle, as this timer will fire even
when the system is otherwise idle. A tick rate of 1000Hz is desirable
for reducing latency, but will also result in the maximum sleep period
being 1ms. This is less than ideal.
2.6.21 introduced dynamic tick functionality. Now, rather than having a
fixed tick interval, if the system is idle the kernel looks at all
outstanding timers and schedules a wakeup in time to answer the next
timer. This allows much longer periods of sleep without compromising
latency. However, for this to be useful it's desirable to have as few
timer wakeup events as possible. The longer the CPU is going to be
asleep, the deeper the sleep state that can be used.
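The difference between the two schemes can be sketched in a few lines. This is only a model of the idea, with hypothetical function names and made-up timer values; the real kernel code is considerably more involved.

```python
def fixed_tick_wakeups(idle_ms, tick_hz):
    """Number of tick interrupts that fire during an idle period
    when the tick rate is fixed."""
    return idle_ms * tick_hz // 1000


def next_wakeup(now_ms, timer_expiries_ms):
    """With a dynamic tick, an idle CPU only needs to wake for the
    earliest outstanding timer. Returns None if nothing is pending."""
    pending = [t for t in timer_expiries_ms if t > now_ms]
    return min(pending) if pending else None


timers = [250, 40, 1000]  # made-up expiry times in milliseconds

# Dynamic tick: sleep straight through to the first timer at 40 ms.
print(next_wakeup(0, timers))
# Fixed 1000Hz tick: the same 40 ms of idle time is interrupted 40 times.
print(fixed_tick_wakeups(40, 1000))
```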
Intel have recently released Powertop, an application that
tracks the causes of wakeups. These wakeups can fall into two categories
- kernel and userspace. Pure userspace timers will usually be due to an
application having a timer to handle some sort of trivial activity, like
blinking a cursor or polling for state. There are a couple of ways to
deal with these:
- Just remove the timer - does it actually need to exist at all?
- If you're polling for state, consider whether it would
be possible to move to an event driven model. For example, right now
screensavers poll the X server in order to obtain information about
whether the session is idle. The X server has to keep track of this
information anyway, so a simple extension could be added to notify
applications when the user has been idle for a certain period of time.
- Make sure you're only waking up when you need to. For instance,
you might want to periodically check whether any new email has appeared.
If there's no route to the internet, don't bother.
- If you must use timers, try to schedule them to go off
simultaneously. It's better to wake the kernel up once and do twice as
much work than it is to wake it up twice. If you're a glib application,
g_timeout_add_seconds() will round to the nearest second, so use it if
possible.
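To see why aligning timers helps, here is a small model that counts distinct wakeup instants for a few periodic timers, with and without rounding each expiry up to a whole second. It is not glib code: the wakeups() helper, the timer offsets and the rounding rule are assumptions chosen for the illustration.

```python
def wakeups(timers_ms, horizon_ms, align_to_second=False):
    """Return the sorted distinct instants (in ms) at which the process
    wakes up. timers_ms is a list of (start_offset_ms, period_ms)."""
    instants = set()
    for start_ms, period_ms in timers_ms:
        t = start_ms + period_ms
        while t <= horizon_ms:
            if align_to_second:
                fire_at = -(-t // 1000) * 1000  # round up to a whole second
            else:
                fire_at = t
            instants.add(fire_at)
            t += period_ms
    return sorted(instants)


# Three made-up timers that were started at unrelated moments.
timers = [(100, 1000), (350, 2000), (800, 1000)]

print(len(wakeups(timers, 10_000)))                        # 22 wakeups
print(len(wakeups(timers, 10_000, align_to_second=True)))  # 9 wakeups
```

The same amount of work gets done in both cases; the aligned version simply does it in fewer, shared wakeups.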
Things in the kernel can be trickier. Kernel interrupts may be appearing
because of some kernel code specifying a timer, but an alternative is
that a userspace application is poking them. Many of the same
considerations as userspace apply here:
- If you don't need especially high-precision timing, use
round_jiffies(). It'll result in synchronisation of many wakeups, and
reduce the overall number.
- If you're being woken up by userspace, figure out why. HAL polls
storage devices every couple of seconds, generating several interrupts.
This is necessary because most storage devices don't notify the system
when media insertion occurs. Conversely, alsa sends notification events
whenever the hardware state changes, and so it's not necessary for
userspace to poll. If you can provide useful information to interested
parties without them having to repeatedly ask, then do it.
- If your hardware is idle, then do what you can to quiesce it. The
Appletouch pad can be put into a mode where it doesn't send packets
until touched, but once touched will continue sending packets. Watch for
a stream of contentless interrupts, and put the hardware back to sleep.
- If the hardware can generate interrupts when something happens, then
use them - don't poll.
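The effect of round_jiffies() on a set of timers can be modelled roughly as follows. This is a simplified Python model, not the kernel function: the tick rate, the quarter-second threshold for rounding down and the omission of any per-CPU offset are assumptions of this sketch.

```python
HZ = 1000  # assumed tick rate: jiffies per second


def round_jiffies_model(j, now, hz=HZ):
    """Round an absolute expiry time (in jiffies) to a whole second.

    Expiries just past a second boundary are rounded down, everything
    else is rounded up. A rounded value that is not in the future is
    rejected and the original expiry is kept."""
    rem = j % hz
    if rem < hz // 4:
        rounded = j - rem
    else:
        rounded = j - rem + hz
    return rounded if rounded > now else j


now = 4000
expiries = [5100, 5300, 5700, 5950]  # four made-up timers
rounded = sorted({round_jiffies_model(j, now) for j in expiries})
print(rounded)  # four separate wakeups collapse into two
```

Timers that do not care about sub-second precision end up sharing a wakeup with whatever else was rounded to the same second.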
Fixing these issues can range from the trivial
(removing unnecessary timers) to the complicated (teaching gstreamer
about alsa notifications, teaching the mixer applet to listen to signals
from gstreamer). It's all helpful, though. Ideally you want your
processor to be averaging at least 20ms in the deepest C state before
it's woken up again. There's a lot of low-hanging fruit out there, and
every fix improves battery life.
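As a back-of-the-envelope check on that 20ms figure, the arithmetic below relates a wakeup rate to the average time between wakeups. It assumes evenly spaced wakeups, and the busy_ms_per_wakeup parameter is a hypothetical knob for the time spent doing work after each one.

```python
def avg_sleep_ms(wakeups_per_second, busy_ms_per_wakeup=0.0):
    """Average uninterrupted idle time between evenly spaced wakeups."""
    return 1000.0 / wakeups_per_second - busy_ms_per_wakeup


def max_wakeups_per_second(target_sleep_ms, busy_ms_per_wakeup=0.0):
    """Highest wakeup rate that still leaves the target idle time."""
    return 1000.0 / (target_sleep_ms + busy_ms_per_wakeup)


print(max_wakeups_per_second(20))  # 50.0: the ceiling for a 20ms average
print(avg_sleep_ms(1000))          # 1.0: what a 1000Hz tick alone allows
```

In other words, an average of 20ms asleep leaves room for at most about 50 wakeups a second across the whole system.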
(This article was originally posted here)
Thanks!, posted 20 May 2007 at 15:16 UTC by mako »
Thanks Matthew, this is great! Also wonderful to see good high quality on Advogato.
hack, posted 23 May 2007 at 06:57 UTC by lkcl »
thank you very much for the article, and for articulating clearly some guidelines on power management development.
not least, it gives me an opportunity to ask this important question:
in the context of real-time embedded development, is a monolithic kernel appropriate to use?
and many experienced people - far more experienced than linus torvalds ever is - already know that the answer is "absolutely not, not a snowball in hell's chance is it appropriate".
i know of one very successful entrepreneur whose product line completely failed because the linux kernel, on which the real-time embedded consumer product was based, was blatantly inadequate.
in a simple (and modern - 2007 / 2008) consumer product, the linux kernel simply could not keep up with the data rates and response times required.
the sooner that linus torvalds gets off his high horse, deflates his egoistic lack of trust of what andrew tanenbaum and others have been trying to tell him for years, the sooner we can get on with the advancement of free software at the core level.
in other words - adoption and merging of the l4linux.org project as a compile-time option - REALLY needs to happen.
and the only thing stopping that from happening is sheer arrogance (combination of ignorance and ego).
the patches have been maintained and kept up-to-date for a couple of years, now.
there is no _technical_ reason why those patches should not be adopted.
there is no _procedural_ reason why those patches should not be adopted.
the only reason, as repeatedly stated, is "we (the linux maintainers) don't want linux to be a microkernel". pure and simple and sheer pig-headed and pointless stupidity.
it is NOT the place of linux maintainers to hold up and terminate other people's options, just because they are not fully understood. linux the kernel got WELL beyond one person's control YEARS ago.
face it. accept it. move with it. embrace and let it happen.
i've said enough.
Re: hack, posted 23 May 2007 at 07:40 UTC by bi »
(Jeez, lkcl, get off your totally unrelated hobbyhorse already...)
mjg59: What's the maximum sleep duration beyond which it makes no difference
to the power consumption (e.g. whether I make the cursor go on/off every 1
second or every 1/2 second makes no difference)? Is there any sort of
interface to find this out?
Re: hack, posted 23 May 2007 at 17:12 UTC by zanee »
lkcl: Why even bring up the microkernel argument again? Why? Having a microkernel isn't going to magically get people to make sure their applications aren't wasting power. I'm not sure what you are adding.
I read this article earlier in comments +5 informative.. thanks mjg59.
microkernels, posted 24 May 2007 at 07:22 UTC by lkcl »
zanee: you ask me why i 'bring up the microkernel argument again'.
the 'microkernel argument' is no longer an argument. the simple fact is that hardware has advanced dramatically in the past twenty to thirty years, making the reasons for a monolithic kernel completely irrelevant.
all that stands in the way of a far better alternative, which, if adopted, would incidentally make power management and latency _much_ easier to deal with - hence the relevance to this article - is total fear and total ego.
why fear? because of lack of control (which was already lost, but not acknowledged as lost).
why ego? again - it's related to control. "i'm right. i've been right since iiii developed and/or maintained linux. iii'm in control".
these things are reallly tough to face up to [unless you have no fear and unless you have no ego - and then it's really easy! :) ]
and until they are, linux the operating system is going keep on getting more and more and more and more and more and more and more complex.
and be able to face and adapt adequately to less and less challenges.
sorry i missed out a bit of the "i'm in control" thing. the rest of the sentence should read "... i'm in control. iii've always been right about microkernels, because i didn't take the time to continue to study them properly - and iii was right at the time when i *did* study them, and the hardware was 15 years older, so iii must *still* be right, now."
"... right? right? everyone agrees with me, right? because iiii'm in control, right? i said so on several mailing lists, and everyone follows my lead, right?"
and to repeat, posted 24 May 2007 at 07:29 UTC by lkcl »
and - zanee - i will repeat it again, because you did not acknowledge that you had read or understood it.
there exists the possibility to GIVE PEOPLE THE CHOICE AT COMPILE TIME.
i will repeat that again, so that you get it.
THERE IS THE POSSIBILITY TO MAKE A COMPILE-TIME SWITCH WHEN COMPILING THE LINUX KERNEL TO ENABLE OR DISABLE A MICROKERNEL OPTION.
i will repeat that again so that you can understand it.
IT IS POSSIBLE TO ADD IN CONFIG_L4_MICROKERNEL AS A COMPILE-TIME OPTION.
i will repeat that again in a different way.
IF CONFIG_L4_MICROKERNEL GETS ADDED, AND YOU DO NOT SWITCH IT ON AT COMPILE TIME, YOU WOULD END UP WITH A MONOLITHIC KERNEL.
do you understand this?
please acknowledge that you understand this.
preferably by repeating it in your own words.
this is not a trick question! :) there is no 'challenge' being issued. i simply want to know, from a sample of one, that the words that i am saying are being understood and taken on board.
Re: microkernels, posted 24 May 2007 at 12:57 UTC by bi »
[...] a far better alternative, which, if adopted,
would incidentally make power management and latency _much_ easier to
deal with - hence the relevance to this article - [...]
How so? In what way does a microkernel make it easier to save power or
reduce latency?
[...] hardware has advanced dramatically in the past twenty to thirty years,
making the reasons for a monolithic kernel completely irrelevant.
So are you saying this: because CPUs are now faster, therefore we can afford
to waste more CPU cycles, so that we can save CPU cycles
later in an easier way (why it's supposedly easier is still a good
question)? Or do you mean something else? Be more specific please?
And is your above generalization based on any hard facts regarding any
And as zanee pointed out,
Having a microkernel isn't going to magically get people to make sure their
applications aren't wasting power. I'm not sure what you are adding.
Re: and to repeat, posted 24 May 2007 at 13:38 UTC by bi »
IT IS POSSIBLE TO ADD IN CONFIG_L4_MICROKERNEL AS A COMPILE-TIME OPTION.
That's just fact-free speculation on How I Would Like Things To Be. Sure,
it's probably possible, but is it easy? Do the possible advantages in
integrating L4 outweigh the hassle of doing this integration in the first
place? Will the whole thing save more CPU power than
Just like your above comments on "hardware", your fact-free speculation on
how great things would be if L4's a compile-time option is just that --
fact-free speculation, mentioning not even a single concrete fact. (Unlike
mjg59's article, which was helpful and informative. I
heartily second zanee's +5.)
latency etc., posted 24 May 2007 at 14:38 UTC by lkcl »
briefly (i'm at work)
latency used to be a massive problem which was answered by reducing the overhead by making everything in kernelspace. the hardware design was restricted especially context switching due to the slow speed at which registers etc. could be saved.
now we have hyperthreading support, etc. etc. etc. the latency and context switching issues which made it necessary for SysV initscripts to be single-process one-script-after-the-other as opposed to e.g. depinit and other parallel-startup systems, and also which made it necessary for kernels to be monolithic in design.
all in the name of avoiding context-switching and other translations and layers of indirection (such as virtual memory remapping) as much as possible.
and bi - if i did all the work for you, you wouldn't achieve anything or learn anything.
_i_ know where to find out the facts - but a) i may have missed some and would be delighted to be pointed in the direction of more facts and examples b) you don't - so look for them, just like i did.
so no - i am _not_ going to tell you everything: you can work it out for yourself, after my hints advise you that it _is_ possible.
that way you will learn something, you will get off my back and stop fighting me, and you will go 'cool. hey. you were right. oh - and you missed this very interesting thing i found out too'.
all round, much better, much more healthier interaction.
The article is about new features in Linux that make it possible to
greatly reduce power drain, as well as tools to help identify remaining
What any of this has to do with Linux's suitability for "real-time
embedded development" completely escapes me; indeed many of these
power-saving features actually increase latency because the CPU has to
switch into a runnable state in order to respond to an interrupt, a fact
which you neglected to mention in your completely tangential and
irrelevant (to the article at hand) remarks.
"L4Linux" will not make Linux into a microkernel. Instead it runs
(monolithic) Linux as just another task under the L4 microkernel. This
distinction is crucial, as the "real-time" components are not Linux at
all, and Linux remains the same monolithic Linux as it's always been.
Finally, the "I know where to find it, but I'm not going to tell you"
attitude reeks of ten-year-old [im]maturity -- You're the one advocating
an unsubstantiated contrary position so the burden of persuasion is on
you.
Oh, and just because something is possible doesn't mean it's worth doing.
re: latency, etc, posted 24 May 2007 at 17:57 UTC by zanee »
lkcl: all that stands in the way of a far better alternative, which,
if adopted, would incidentally make power management and latency _much_
easier to deal with - hence the relevance to this article - is total
fear and total ego.
My apologies, I was working and just came in for my daily visit. That
said, I get tired of these same old arguments in regards to
communication and speed when it comes to Monolithic and Micro, sometime
last year I told myself I wouldn't discuss this again.
lkcl: THERE IS THE POSSIBILITY TO MAKE A COMPILE-TIME SWITCH WHEN
COMPILING THE LINUX KERNEL TO ENABLE OR DISABLE A MICROKERNEL OPTION.
So. What you are saying is that the linux kernel maintainers should take
it upon themselves to support l4linux because it is an option? What's
preventing the person who wants to run linux on top of l4 from doing so
right now? Have you taken a look at recent progress here:
We both know that's not going to fly.
lkcl: now we have hyperthreading support, etc. etc. etc. the latency
and context switching issues which made it necessary for SysV
initscripts to be single-process one-script-after-the-other as opposed
to e.g. depinit and other parallel-startup systems, and also which made
it necessary for kernels to be monolithic in design.
Come on, this is bollocks. There are plenty of parallel init systems
including depinit and none forced kernels to be monolithic. You could
argue no one thought of the idea, or there was no software around at the
time to handle parallel handoff of bringing up sub-systems. However, the
problem with microkernels has always been speed and maintenance.
I could understand if you were making the claim that there would be some
subsystem in your new microkernel to manage power efficiently but that
still wouldn't prevent people from wasting power in their apps, which is
what the article was about. It wouldn't make micro kernels any faster
and basically it'd be completely off-topic.
Anyway, seriously this time.. I'm no longer going to debate such issues
anymore. So; to wrap up. I strongly disagree, but I respect your
position to be wrong :-)
You're free to prove me wrong in the interim. In which case when you
bring some concrete evidence or prototype or mockup. I'll return to
debate and discussion.
Comments, posted 24 May 2007 at 19:47 UTC by nymia »
I've scratched my head so many times why kernel space or even perhaps user space code have access to C0-C4. Isn't it supposed to be part of a lower level code sitting at the micro-code level? If there are no blocks of code going through the pipeline, then some micro-code should be smart enough to put the lights out.
doofus, posted 25 May 2007 at 02:26 UTC by elanthis »
there exists the possibility to GIVE PEOPLE THE CHOICE AT
COMPILE TIME.
Ah, the age-old "instead of coming up with the best possible design,
let's take two broken-ass designs and let the user decide which way he'd
prefer his system to work inadequately."
Instead of bloating Linux - an inherently monolithic design - with code
to make it into a half-assed not-really micro-kernel, why not just
*gasp* not use Linux for everything and pick a different kernel/OS for
tasks where Linux isn't the best fit?
If real-time embedded development isn't possible with a monolithic Linux
kernel, then just stop being a doofus and trying to force a square peg
into a round hole. Use something else. Duh.
_i_ know where to find out the facts - but a) i may have
missed some and would be delighted to be pointed in the direction of
more facts and examples b) you don't - so look for them, just like i
did.
It's a wonder the human race manages to improve itself at all with
mindsets like that.
If you don't want to explain it, fine. At least link to the relevant
explanations. You have enough time to waste making post after post
after post on an unrelated article on Advogato, so you clearly have
enough time to post a few informative links and avoid sounding like such
Isn't it supposed to be part of a lower level code sitting
at the micro-code level? If there are no blocks of code going through
the pipeline, then some micro-code should be smart enough to put the
lights out.
The CPU doesn't have access to as many points of data as the kernel for
judging when that is appropriate. Plus the kernel needs to stay in the
know of when that stuff is happening. Better for the kernel to be in
Plus, moving it to the CPU doesn't fix the problem mentioned in the
article - the apps and drivers have to stop being blockheads.
Re: Comments, posted 25 May 2007 at 04:08 UTC by slamb »
I've scratched my head so many times why kernel space or even perhaps user
space code have access to C0-C4. Isn't it supposed to be part of a lower level code sitting at
the micro-code level? If there are no blocks of code going through the pipeline, then some
micro-code should be smart enough to put the lights out.
Well, mjg59 said:
The problem with these states is that the deeper the sleep, the longer it takes
the CPU to wake up. In order to avoid excessive reductions in performance, the kernel must
keep track of the processor usage pattern and avoid putting the CPU into a deep sleep mode
when it's likely that it'll be needed for more work in the near future.
My guess is that "more savings = longer to wake up" is fundamental, not just how they
designed the interface. So it makes sense to put the logic to go to sleep at a higher level
- the kernel may be better able to guess future usage than the processor, and at the very
least it's more easily updateable if the heuristic is bad.
But in any case, the most direct reason is "because the processor people did it that
way". Even if it made no sense, it sounds like processors have been shipping with this design
for a while, so it's what kernel people code to.
lkcl: Please stop. I'm embarrassed enough by your flame-filled misinformed non
sequiturs when they're in your blog. Don't post them to other people's articles. It reflects
poorly on the community, especially given your "master" rating. Microkernels have almost
nothing to do with power saving or userspace IPC, and Linus was not the one who misdesigned
whatever product you're alluding to. I won't say that Linux is perfect, but I will ask that you
leave its criticism to people who know what they are talking about.
alternatives..., posted 25 May 2007 at 11:48 UTC by lkcl »
Instead of bloating Linux - an inherently monolithic design - with code to make it into a half-assed not-really micro-kernel, why not just *gasp* not use Linux for everything and pick a different kernel/OS for tasks where Linux isn't the best fit?
yep - am giving serious consideration to dumping linux-the-kernel from my life, and finding a viable alternative.
various options include l4-hurd, minix-v3 and others. i've started putting some bug-reports / feature requests in, already, such as the request for an xorg-video-sdl driver, with a view to also putting in a feature request for the l4 or minix or hurd or whoever people to put in a corresponding SDL layer!
one of the problems with using microkernels and then having what used to be an OS being ported effectively into the userspace [of the microkernel] is that security separation in the microkernel is far far better - including not allowing direct access to things like... video cards :)
but - where there's a will, there's a way...
Re: alternatives..., posted 25 May 2007 at 19:26 UTC by bi »
not allowing direct access to things like... video cards :)
Oh, and I thought you preferred L4-* to stock Linux because the latter's
unsuited to "real-time embedded development". Why do you need to worry about
video cards when doing real-time embedded stuff? Or maybe real-time embedded
development isn't exactly the issue you have with Linux after all?
Jeez, this is indeed very embarrassing. You're just mouthing lots of words
with no clue as to what's really going on out there, or even what
specific tasks you're really trying to solve in the first place.
Now, back to power management...
Realistically, the optimal length of time to sleep depends on the characteristics of your system. If there's an application that has to wake up once a second, there's no real reason to avoid having everything else also wake up providing that those wakeups can be synchronised.
The longer the CPU does nothing, the better the powersavings. But when you're waking up very rarely, the absolute gain is pretty small. The correct answer is going to depend on how much power the deep sleep states save on your platform, how long you end up staying in each state and what effect on the user experience different levels of latency have.
As slamb suggests, the state is completely invisible to userspace. If code is executing, you're in C0. The kernel will automatically enter sleep states when there's no code executing. There's several reasons to put it at the kernel level rather than the CPU level - the main one is that the kernel knows when the next wakeup event is due, and so knows the maximum length of the sleep. If that's lower than the latency of exiting a deeper state, it'll just use a shallow sleep state. The other significant issue is that deeper sleep states involve the CPU dropping off the memory bus, which means that it can no longer do DMA snooping. If you do a DMA transfer while in C3 or deeper, really bad things happen. The kernel keeps track of DMA state in order to prevent that from happening.
The only relevant aspect of kernel design when it comes to saving power is how much overhead there is. My understanding of most microkernel design is that there's inherently a certain degree of overhead from having to pass messages through a certain number of layers, and therefore that it's likely that (given everything else being equal) a microkernel would be less suitable by a fairly negligibly small amount.
On the other hand, you seem to have been asking about real-time issues. Linux isn't a real-time kernel. Various alternatives exist that provide real-time extensions to Linux, but all of them have made various compromises that make them uninteresting to the mainstream kernel community. Perhaps at some point someone will come up with a solution that doesn't have this problem, and at that point perhaps someone will care.
looove, posted 26 May 2007 at 19:34 UTC by lkcl »
educate me, posted 26 May 2007 at 19:41 UTC by lkcl »
i will create and point you to an FAQ so that you don't have to repeat the same answers on different articles to the same issues that i always raise in response to exactly the same issues as they are raised in different articles.
but - to repeat it again, just one more time: i track in my head somewhere between eighty to one hundred separate free software projects.
your (plural) expertise is in 'coding'. if i make mistakes - and, just like you, i AM going to make mistakes, EDUCATE ME.
none of us are going to 'go away'.
therefore, work WITH me to 'get the right information' instead of saying 'please stop' i'm never going to stop.
so you can work *with* me to educate me on the *correct* things (by pointing out mistakes in my diary which is where they should be corrected - it's why i put them there in the first place, so that you and people like you can respond to them) BEFORE i end up putting them on articles.
so - remind me: why again are you blaming *me* for mistakes i make in articles, by ignoring me and maintaining silence by NOT educating me on mistakes in my diary???
*get* me the right information. i never forget anything, so *get* me the right information to spout forth at people.
Re: educate me, posted 26 May 2007 at 20:41 UTC by bi »
so - remind me: why again are you blaming *me* for mistakes i make in
articles, by ignoring me and maintaining silence by NOT educating me on
mistakes in my diary???
To wit, what you're saying is: "Yes, you're right, and I'm
wrong... but you're still wrong, and I'm still right! Bingo!" Holy cow...
do you really expect that to wash?
Look, it was you who acted like you knew all the answers and
everyone else was obviously wrong; it was you who made the choice
to repeatedly rush to the keyboard and type nonsense (instead of, say,
looking up the facts one more time beforehand; or just keeping quiet); and
suddenly all these choices made by you and you alone are someone
else's fault? Holy cow.
The correct answer is going to depend on how much power the deep sleep
states save on your platform, how long you end up staying in each state and
what effect on the user experience different levels of latency have.
That sounds... complicated. :|
Complicated, posted 27 May 2007 at 00:10 UTC by mjg59 »
Well, the most power-efficient thing for a computer to do is never to run any code whatsoever. The more latency you're willing to introduce, the better you can optimise things by tying userspace wakeups to other wakeups that are going to happen already. For the most part, aiming for a system total of under 10 wakeups a second is probably pretty reasonable.
The more latency you're willing to introduce, the better you can optimise things by tying userspace wakeups to other wakeups that are going to happen already. For the most part, aiming for a system total of under 10 wakeups a second is probably pretty reasonable.
a real-time operating system, which is designed from the ground up usually with absolute cast-iron-guaranteed response times to nanosecond accuracy (these days) is (in the opinion of several people more experienced than i am, which anyone wishing to find out and confirm can do so if they so wish) the better vehicle for such matters to be addressed.
that having been said: i applaud your efforts, mjg59, in fighting against / working with an inappropriate design, and for working well with "what is de-facto not your decision".
(lkcl, did you actually look up the facts one more time just before you posted that?)
(Or maybe this'll just be another case of "oh shucks... I'm wrong again. But it's all your fault!")
the concept of 'try to make two things happen at once' doesn't really fly, because it imposes a high degree of inter-coordination between applications, kernel events etc.
which is purely impossible.
it would be much better to have the timer / scheduler API be able to specify the accuracy as an optional argument to the wakeup event.
in that way, the operating system has a way to decide at the level
at which it is important to make the decision.
and the user applications can remain simple. a low-priority event such as 'check email' would be on a +/- accuracy of something approaching 100% of the checking cycle [actually, you would need separate + accuracy and a separate - accuracy argument, to avoid problems of the wakeup time being scheduled repeatedly far too early]
not only that but the operating system can decide 'oh i can push the
boundaries of these wakeup events in order to achieve longer sleep times and therefore deeper sleep states'.
and if an external interrupt (e.g. phone call, button press) wakes up the device, and it can return to sleep again having created a new timer event, then those boundaries can be re-evaluated.
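One way to picture the proposal above: if every timer carries an acceptable window (an earliest and a latest firing time, standing in for the separate - and + accuracy arguments), the scheduler can choose a small set of wakeup instants that satisfies all of them. The sketch below uses the standard greedy approach for this kind of interval problem; the schedule_wakeups() name and the sample windows are hypothetical, and this is not a description of what any kernel actually implements.

```python
def schedule_wakeups(windows):
    """Pick as few wakeup times as possible so that every
    (earliest, latest) window contains at least one of them."""
    chosen = []
    # Handle the windows that close first before the others.
    for earliest, latest in sorted(windows, key=lambda w: w[1]):
        if chosen and earliest <= chosen[-1]:
            continue  # the most recent wakeup already falls in this window
        chosen.append(latest)  # wake as late as allowed to share with others
    return chosen


# Made-up timers: (earliest, latest) acceptable firing times in ms.
windows = [(0, 60), (50, 55), (30, 100), (90, 120), (200, 210)]
print(schedule_wakeups(windows))  # [55, 120, 210]
```

Five timers are served by three wakeups here; the wider the windows applications are willing to accept, the fewer wakeups are needed.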
To a large extent, that's what Linux now does. I'm not at all clear on how improved latency guarantees would help here - they tend to be associated with increased overhead, which would also clearly result in you spending more time in high power states. Right now most of the interesting power management work (that is, power management in the context of having a useful broadly general purpose device rather than power management in the context of heavily embedded hardware) is being done on Linux (Symbian is way behind here - I'm not sure what Microsoft are up to), but if you have any pointers to people actually implementing anything relevant on realtime or microkernel platforms I'd certainly be interested to see how they're actually attempting to use their advantages.
In any case, I'll be at Debconf next week if you want to discuss this.
Impressive, posted 26 Jul 2007 at 18:08 UTC by slamb »
I'm trying to leave a Linux VM always running on my OS X laptop for development, so this is suddenly quite important for me. These tools are great, and going tickless greatly improved my battery life. Unfortunately, it seems like the VMware Tools generate about 24 wakeups/sec in total. The entire rest of the system is only generating about 9 wakeups/sec, so this is an impressive show from the Linux community and quite a sad one from proprietary software vendors. Parallels was even worse - they don't have APIC or HPET hardware emulation, so I was unable to run tickless Linux.
Of the timers that are firing, most seem to be related to expiration stuff in the kernel.
neigh_periodic_timer is a good example. If I'm reading this code right, it fires every
(net.FAMILY.neigh.INTERFACE.base_reachable_time_ms +- 50%) / 2 / hash_buckets milliseconds
to look through a single hash bucket for stale entries (they become stale 60 seconds after
update). (On my system, that turns out to be once every second or two.) It does round_jiffies() if
the interval is over a second.
In other words, it fires about once a second to see if any of my three ARP entries have been
inactive for a minute.
Is there any particular reason the kernel has these mostly-fixed timer intervals rather than
firing when the actual work needs to be done? I.e., why not maintain an index of the ARP entries
sorted by when they become stale? It could then suppress firing a timer until after at least one
entry is actually stale. Is there some reason it would be expensive to maintain that structure? Or
is it just that no one has bothered since minimizing wakeups became a priority?
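For what it's worth, the kind of structure I have in mind looks something like the sketch below: entries are indexed by the time they become stale, and the timer is only armed for the earliest of them. The StaleIndex class and its method names are invented for the illustration, and it says nothing about what locking or memory costs this would have inside the kernel.

```python
import heapq


class StaleIndex:
    """Tracks cache entries by the time at which they become stale."""

    def __init__(self, lifetime_s=60):
        self.lifetime_s = lifetime_s
        self._heap = []      # (stale_at, key); may hold superseded items
        self._stale_at = {}  # key -> current stale time

    def touch(self, key, now):
        """Record that an entry was updated at time `now`."""
        stale_at = now + self.lifetime_s
        self._stale_at[key] = stale_at
        heapq.heappush(self._heap, (stale_at, key))

    def _drop_superseded(self):
        # Discard heap items that no longer match the entry's stale time.
        while self._heap and self._stale_at.get(self._heap[0][1]) != self._heap[0][0]:
            heapq.heappop(self._heap)

    def next_timer(self):
        """Earliest time any entry becomes stale, or None if empty."""
        self._drop_superseded()
        return self._heap[0][0] if self._heap else None

    def expire(self, now):
        """Remove and return the keys that are stale at time `now`."""
        expired = []
        while True:
            self._drop_superseded()
            if not self._heap or self._heap[0][0] > now:
                break
            _, key = heapq.heappop(self._heap)
            del self._stale_at[key]
            expired.append(key)
        return expired


index = StaleIndex()
index.touch("a", 0)
index.touch("b", 10)
index.touch("a", 30)       # "a" was refreshed, so it now goes stale at 90
print(index.next_timer())  # 70: nothing needs checking before then
```

With three entries, that is one timer firing per expiry rather than one per second.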
ARP, posted 21 Aug 2007 at 20:46 UTC by ncm »
I don't see any need for timers on the ARP cache. The only time you
care what's in it is when you do a lookup, and you can discard any stale
ones you find.
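A minimal sketch of that approach, with no timer at all: staleness is only checked when an entry is looked up. The LazyCache class, its 60 second lifetime and the sample address are assumptions for the example, not the kernel's neighbour table.

```python
class LazyCache:
    """A cache that discards stale entries when they are looked up."""

    def __init__(self, lifetime_s=60):
        self.lifetime_s = lifetime_s
        self._entries = {}  # key -> (value, updated_at)

    def update(self, key, value, now):
        self._entries[key] = (value, now)

    def lookup(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, updated_at = entry
        if now - updated_at >= self.lifetime_s:
            del self._entries[key]  # stale: throw it away on discovery
            return None
        return value


cache = LazyCache()
cache.update("10.0.0.1", "00:11:22:33:44:55", now=0)
print(cache.lookup("10.0.0.1", now=30))  # still fresh
print(cache.lookup("10.0.0.1", now=61))  # stale, discarded, returns None
```

The cost of this design is that an entry which is never looked up again stays in memory until something else removes it.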