Linus Torvalds writes: (Summary)
For example, say that you're some runtime that wants to use the percpu
thing for percpu counters - because you want to avoid cache ping-pong,
and you want to avoid per-thread allocation overhead (or per-thread
scaling for just summing up the counters) when you have potentially
tens of thousands of threads.
Now, how does this runtime work *together* with
- CPU hotplug adding new CPUs while you are running (and after you allocated your percpu areas)
- libraries and system admins that limit - or extend - you to a certain set of CPUs
- another library (like the malloc library) that wants to use the same interface for its percpu allocation queues.
Maybe all of this "just works", but I really want to see an existence proof.
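
To make the questions above concrete, here is a minimal userspace sketch of the kind of percpu counter such a runtime might build. It is not the interface under discussion: it uses glibc's sched_getcpu() plus relaxed atomic adds as a stand-in for a restartable-sequence style update, and every name in it (struct percpu_counter, percpu_counter_init, percpu_counter_add, percpu_counter_sum) is hypothetical. The 64-byte cache line size is also an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for sched_getcpu() in glibc */
#endif
#include <assert.h>
#include <sched.h>           /* sched_getcpu(), Linux/glibc specific */
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>          /* sysconf() */

/* One counter per (assumed 64-byte) cache line to avoid cache ping-pong. */
struct slot {
    _Alignas(64) atomic_long value;
};

/* Hypothetical percpu counter: the slot count is fixed when it is created. */
struct percpu_counter {
    long nslots;
    struct slot *slots;
};

/* Allocate one slot per CPU configured *at this moment*. Returns 0 on success. */
static int percpu_counter_init(struct percpu_counter *c)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);
    if (n < 1)
        n = 1;
    c->slots = aligned_alloc(64, (size_t)n * sizeof(struct slot));
    if (c->slots == NULL)
        return -1;
    for (long i = 0; i < n; i++)
        atomic_init(&c->slots[i].value, 0);
    c->nslots = n;
    return 0;
}

/* Add to the slot of the CPU we are (probably) running on. */
static void percpu_counter_add(struct percpu_counter *c, long delta)
{
    int cpu = sched_getcpu();
    if (cpu < 0)
        cpu = 0;
    /*
     * A CPU hotplugged after init can have an id >= nslots. Folding the id
     * keeps the sum correct, but two CPUs may then share one slot.
     * The thread can also migrate right after sched_getcpu(), which is why
     * this sketch needs an atomic add rather than a plain increment.
     */
    long idx = (long)cpu % c->nslots;
    assert(idx >= 0 && idx < c->nslots);
    atomic_fetch_add_explicit(&c->slots[idx].value, delta,
                              memory_order_relaxed);
}

/* Sum all slots: cost scales with the CPU count, not the thread count. */
static long percpu_counter_sum(struct percpu_counter *c)
{
    long total = 0;
    for (long i = 0; i < c->nslots; i++)
        total += atomic_load_explicit(&c->slots[i].value,
                                      memory_order_relaxed);
    return total;
}
```

Even this toy version runs into each point in the list. The slot array is sized once, so a CPU added later has no slot of its own. An affinity mask set by a library or an admin leaves most slots unused, or, if widened later, sends threads to CPUs the runtime never planned for. And a malloc library wanting the same thing would allocate and index its own percpu area independently, with no coordination between the two.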