HTTP H.264 from multiple cameras with sighttpd's shrecord
Today we'll look at how to use sighttpd for multi-camera H.264 video encoding and streaming.
This post is the last in a series about using the hardware video encoding and
image conversion features of Renesas SH-Mobile on Linux. In earlier posts,
we described how we manage hardware resources from userspace (libuiomux),
how we use the hardware image manipulation features for colorspace conversion
and rescaling (libshveu), hardware encoding with libshcodecs, and simple
HTTP streaming from standard input with sighttpd.
Today's post ties all these together, showing how to use sighttpd's support
for integrated capture, video encoding and streaming. We'll also look at the
performance of the server under some light load, rather than the performance of
raw encoding to /dev/null that was done in the earlier libshcodecs article.
(Apologies to people reading this from Planet Haskell,
I'll have to whip up something with Happstack and Hogg to make up for the disruption ;-)
Configuration
The sighttpd.conf setup is fairly straightforward: we put the options for each
stream that we want to serve into an <SHRecord> block, including the
desired URL path and the location of the control file to use. The same control
files that are used for shcodecs-record can be used (the output
filename is ignored by sighttpd).
Listen 3000
<SHRecord>
Path "/video0/vga.264"
CtlFile "/usr/share/shcodecs-record/k264-v4l2-vga-stream.ctl"
Preview off
</SHRecord>
<SHRecord>
Path "/video0/cif.264"
CtlFile "/usr/share/shcodecs-record/k264-v4l2-cif-stream.ctl"
Preview off
</SHRecord>
<SHRecord>
Path "/video1/vga.264"
CtlFile "/usr/share/shcodecs-record/k264-v4l2-vga-stream2.ctl"
Preview off
</SHRecord>
<SHRecord>
Path "/video1/cif.264"
CtlFile "/usr/share/shcodecs-record/k264-v4l2-cif-stream2.ctl"
Preview off
</SHRecord>
I turn the on-screen Preview off because the Ecovec board I'm using has no
LCD panel and is instead plugged directly into an HDMI display, which
introduces a lot of bus contention. Disabling the on-screen preview improves
performance markedly.
This configuration on the host ecovec will make four H.264 streams appear at:
http://ecovec:3000/video0/vga.264,
http://ecovec:3000/video0/cif.264,
http://ecovec:3000/video1/vga.264, and
http://ecovec:3000/video1/cif.264.
These streams are derived from two camera sources, which here happen to be
/dev/video0 and /dev/video2 (sic) as specified in the control files.
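Before doing anything fancier, it's worth sanity-checking one of these endpoints from a client machine. A minimal sketch: fetch a short sample with curl and check that it begins with the Annex B start code (00 00 00 01) that a raw H.264 elementary stream should open with. The is_h264_es helper below is purely illustrative, not part of sighttpd or libshcodecs:

```shell
#!/bin/sh
# Check whether a file looks like a raw H.264 Annex B elementary stream
# by inspecting its first four bytes for the 00 00 00 01 start code.
is_h264_es () {
    head=$(od -A n -t x1 -N 4 "$1" | tr -d ' ')
    [ "$head" = "00000001" ]
}

# Usage against the running server (hostname as in this article):
#   curl -s -m 2 http://ecovec:3000/video0/vga.264 -o sample.264
#   is_h264_es sample.264 && echo "looks like H.264"
```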
Performance
Before any clients connect, sighttpd is continuously running the cameras,
colorspace conversion, rescaling and encoding for all 4 streams. The CPU usage
is similar to that of shcodecs-record encoding 4 streams,
i.e. a little under 2% of this 500MHz SH7724 CPU:
top - 06:47:47 up 3:35, 2 users, load average: 0.17, 0.13, 0.24
Tasks: 50 total, 1 running, 49 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.9%us, 0.6%sy, 0.0%ni, 95.8%id, 1.6%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 248332k total, 211220k used, 37112k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 143752k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27787 root 20 0 93284 9.8m 1396 S 1.6 4.0 0:01.72 sighttpd
27821 root 20 0 2976 1204 988 R 1.0 0.5 0:00.19 top
1 root 20 0 2372 708 620 S 0.0 0.3 0:01.46 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
I hacked up the following quick script on a locally connected Linux PC to create
400 stream connections (100 to each of the 4 video streams) and fire them off one
per second. The -m option to curl provides a maximum timeout for each connection,
which we use here to fetch 20s of video during each connection.
(If you know a similar option for httperf to tell it to receive only a specified
duration of a continuous HTTP stream with each connection, please leave a note in
the comments!)
#!/bin/sh
for i in `seq 1 100`; do
  curl http://ecovec:3000/video0/vga.264 -o /dev/null -s -m 20 \
    -w "$i vga0: HTTP %{http_code} , %{time_total}s %{size_download} bytes\n" >> benchmark.log &
  sleep 1
  curl http://ecovec:3000/video1/vga.264 -o /dev/null -s -m 20 \
    -w "$i vga1: HTTP %{http_code} , %{time_total}s %{size_download} bytes\n" >> benchmark.log &
  sleep 1
  curl http://ecovec:3000/video0/cif.264 -o /dev/null -s -m 20 \
    -w "$i cif0: HTTP %{http_code} , %{time_total}s %{size_download} bytes\n" >> benchmark.log &
  sleep 1
  curl http://ecovec:3000/video1/cif.264 -o /dev/null -s -m 20 \
    -w "$i cif1: HTTP %{http_code} , %{time_total}s %{size_download} bytes\n" >> benchmark.log &
  sleep 1
done
The middle section of the benchmark.log file produced (while there are 20 parallel connections)
looks like this:
52 vga1: HTTP 200 , 20.001s 475165 bytes
52 cif0: HTTP 200 , 20.004s 211838 bytes
52 cif1: HTTP 200 , 20.608s 353310 bytes
53 vga0: HTTP 200 , 20.024s 963123 bytes
53 vga1: HTTP 200 , 20.015s 568863 bytes
53 cif0: HTTP 200 , 20.032s 1172898 bytes
53 cif1: HTTP 200 , 20.012s 1004619 bytes
54 vga0: HTTP 200 , 20.039s 1269070 bytes
54 vga1: HTTP 200 , 20.068s 951508 bytes
54 cif0: HTTP 200 , 20.059s 1088203 bytes
and while that is running, top looks like this:
top - 08:30:54 up 5:18, 2 users, load average: 0.30, 1.28, 0.79
Tasks: 49 total, 1 running, 48 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.0%us, 1.2%sy, 0.0%ni, 81.0%id, 3.6%wa, 1.2%hi, 0.9%si, 0.0%st
Mem: 248332k total, 210472k used, 37860k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 144124k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29162 root 20 0 130m 9.8m 1348 S 12.7 4.0 0:01.42 sighttpd
29169 conrad 20 0 2976 1204 988 R 1.3 0.5 0:00.15 top
1 root 20 0 2372 708 620 S 0.0 0.3 0:01.46 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
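Once a run completes, the log can be summarized in one pass. The helper below is a sketch that assumes the -w format used in the script above (field 4 is the HTTP status and the next-to-last field the byte count); it counts any non-200 responses and totals the bytes delivered:

```shell
#!/bin/sh
# Summarize a benchmark.log produced by the curl loop above:
# count non-200 responses and total the downloaded bytes.
summarize () {
    awk '$4 != 200 { errs++ }
         { bytes += $(NF-1) }
         END { printf "errors: %d, total: %d bytes\n", errs + 0, bytes }' "$1"
}

# e.g.  summarize benchmark.log
```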
I'm not claiming that it can handle thousands of connections, but at least we
can be sure that an embedded camera system based on this will reliably provide
all the streams that you have asked it to capture and encode without dropouts.
The usual use-case for this is as an input to an HTTP stream repeater on a
larger server with a faster upstream connection, designed to handle a much higher
load.
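As a rough sketch of that repeater setup, assuming the <Stdin> input block described in the earlier stdin-streaming post, the larger server would run its own sighttpd serving whatever arrives on standard input (the port, path and Type value here are illustrative, not prescribed):

```
# repeater.conf on the larger server: serve stdin at the given path
Listen 8080
<Stdin>
Path "/camera/vga.264"
Type "video/h264"
</Stdin>
```

It could then pull from the camera board with something like `curl -s http://ecovec:3000/video0/vga.264 | sighttpd`, adjusting the invocation to however your sighttpd build locates its configuration file.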
The bigger picture
Stepping back, the point of this series of articles has been to demonstrate
that it is very easy to use hardware acceleration with Linux: we can export complex
driver functionality to userspace, we can quickly develop layered applications,
and we can do all this while still leaving enough CPU free for other (perhaps
unrelated) tasks.
Syndicated 2010-04-28 00:00:00 (Updated 2010-04-28 00:00:02) from Conrad Parker