[Tizen General] Tizen Audio Stack

Patrick Shirkey pshirkey at boosthardware.com
Fri Sep 27 12:02:43 GMT 2013


On Fri, September 27, 2013 9:37 pm, Artem Bityutskiy wrote:
> On Fri, 2013-09-27 at 21:15 +1000, Patrick Shirkey wrote:
>> On Fri, September 27, 2013 8:24 pm, Artem Bityutskiy wrote:
>> > On Fri, 2013-09-27 at 19:18 +1000, Patrick Shirkey wrote:
>> >> On Fri, September 27, 2013 4:19 pm, Artem Bityutskiy wrote:
>> >> > On Wed, 2013-09-25 at 02:49 +1000, Patrick Shirkey wrote:
>> >> >> Hi,
>> >> >>
>> >> >> A quick update for those who are following this thread.
>> >> >>
>> >> >> We are tracing the audio latency when running a combination of
>> >> >> JACK and PA.
>> >> >>
>> >> >> We are currently looking at the PA Stream Buffer as a potential
>> >> >> bottleneck.
>> >> >>
>> >> >> During testing I have seen latency as low as 4ms round trip but
>> >> >> also as high as 1300ms and the results are not stable on my
>> >> >> hda_intel sound device.
>> >> >
>> >> > I think you said earlier that you are using an x86 desktop for
>> >> > testing. What I'd try to do is to prevent deep C-states. Indeed, if
>> >> > the package where you run the pulseaudio/jack/other related
>> >> > processes is able to enter a deep C-state, there is an exit latency
>> >> > associated with it.
>> >> >
>> >> > To make a long story short, there is the /dev/cpu_dma_latency file,
>> >> > where you can write the latency you can tolerate (in microseconds).
>> >> > The kernel will translate this to the deepest C-state the processor
>> >> > can enter.
>> >> >
>> >> > You can write 0 there, which will mean that the CPU won't ever enter
>> >> > any C-state and will busy-loop when idle. Bad for power consumption.
>> >> > But you can just experiment to see if this helps to lessen the
>> >> > latency deviation that you observe.
>> >> >
>> >> > You can write a larger number, then the CPU will enter C1 at least,
>> >> > which is already a lot better for PM.
>> >> >
>> >> > You can verify which C-states you hit with the 'turbostat' tool or
>> >> > powertop. The former comes, I think, from the kernel-tools package in
>> >> > Fedora. Play with the latency number and use these tools to check
>> >> > which C-states it corresponds to.
>> >> >
>> >> > Ah, and there is a trick. You should open /dev/cpu_dma_latency,
>> >> > write your latency (as ascii or binary, both are ok), and _do not
>> >> > close it_. As soon as you close it, the kernel will switch back to
>> >> > the default latency constraint.
>> >> >
>> >> > Also, advanced drivers usually use the kernel PMQoS infrastructure
>> >> > and instruct the system when they cannot tolerate high latency.
>> >> >
>> >> > When I do 'git grep PM_QOS_CPU_DMA_LATENCY' in the kernel, I do not
>> >> > see the HDA driver doing this.
>> >> >
>> >> > Anyway, this may not solve the issue, but I'd suggest trying it to
>> >> > see if it at least partially helps. And I am very interested to hear
>> >> > whether it does or not, or maybe you already tried this out.
>> >> >
>> >>
>> >>
>> >> I can't get turbostat with apt on Debian as it has been removed from
>> >> the acpica-tools package.
>> >
>> > Ok. You can easily compile it yourself if you want. It is in the
>> > kernel tree in tools/power/x86/turbostat/, where you just type 'make'.
>> >
>> > Anyway, the only reason I refer to this tool is that you can use it to
>> > check the C-state residency statistics, and how C-state residency is
>> > affected by /dev/cpu_dma_latency settings.
>> >
>> >> Using powertop I see these stats with /dev/cpu_dma_latency set to 0:
>> >
>> > Did you open the file, write 0, and keep the file open? It does not
>> > look like it, because I see you hit C3.
>> >
>> > I do not know how to do this from the console, so I wrote a custom
>> > script for this.
>> >
>> > I have a python script which can do this, I can send it to you, let me
>> > know in a private e-mail.
>> >
>> >> Idle
>> >>           Package   |            CPU 0
>> >> POLL        0.0%    | POLL        0.0%    0.0 ms
>> >> C1          0.3%    | C1          0.4%    0.1 ms
>> >> C2         17.8%    | C2         17.2%    0.2 ms
>> >> C3         13.1%    | C3         12.0%    0.1 ms
>> >
>> > See, you are hitting C2 and C3. C3 has the highest exit latency. But I
>> > do not know what that would be for your platform.
>> >
>>
>> I see results similar to this with powertop while using your script :
>> ./pmqos set cpu-latency 0
>
> Hmm, OK, I do not have comments now. I use IvyBridge, and there this
> works... You obviously have something older, a Westmere? Can you send me
> your /proc/cpuinfo ? I do not promise to come up with suggestions, but
> will try to check a few things.
>

Thanks for taking the time to provide this feedback. This laptop is about
5 years old. It has similar capabilities to modern mobile hardware in that
it is a powerful dual core machine with an onboard hda_intel audio chipset,
but it's really just a test system for this process, used to try to find
all the potential bottlenecks.

Everything so far helps in the decision making process if it turns out
that PA needs fixing for the JACK + PA combination. The time spent on the
fixes might not be worth it in comparison to other options that have been
discussed.

It could be useful to run these tests on the Medfield chips too to get a
wider dataset.
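
To compare platforms, the exit latency the kernel reports for each C-state
can be read from the cpuidle sysfs directory. A rough Python sketch follows;
the function names are made up for illustration, and deepest_allowed() only
approximates the filtering step (the real idle governor also weighs the
predicted idle time).

```python
from pathlib import Path


def read_cpuidle_states(cpu=0, root="/sys/devices/system/cpu"):
    """Return name, exit latency (us) and residency (us) per idle state."""
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    dirs = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    states = []
    for d in dirs:
        states.append({
            "name": (d / "name").read_text().strip(),
            "latency_us": int((d / "latency").read_text()),
            "time_us": int((d / "time").read_text()),
        })
    return states


def deepest_allowed(states, limit_us):
    """Pick the deepest state whose exit latency fits within limit_us."""
    allowed = [s for s in states if s["latency_us"] <= limit_us]
    return max(allowed, key=lambda s: s["latency_us"]) if allowed else None


if __name__ == "__main__":
    for s in read_cpuidle_states(0):
        print(f'{s["name"]:8} exit latency {s["latency_us"]:6} us, '
              f'residency {s["time_us"]} us')
```

Comparing the residency numbers before and after setting the latency limit
should show whether the deeper states are really being avoided.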



$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz
stepping	: 6
microcode	: 0x48
cpu MHz		: 1830.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm tpr_shadow
bogomips	: 3657.30
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz
stepping	: 6
microcode	: 0x48
cpu MHz		: 1830.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm tpr_shadow
bogomips	: 3657.45
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:


--
Patrick Shirkey
Boost Hardware Ltd

