Video Archive Format Details
Introduction
COOL.STF has contracted with the Internet Archive to record 20 television channels in digital format for a year or more. The incoming video (in either digital or analog format) is brought down to composite or S-Video (Y/C) format and then re-encoded using FutureTel NS320 video encoders. Each archiving system runs four encoders and buffers the generated MPEG data using two hard drives. When all four of the 1GB buffers on a hard drive fill up, the system switches recording to the other hard drive, and a background process then transfers the 4 x 1GB files onto a Breece Hill Q215 DLT 7000 tape robot.
The system also uses a couple of Nokia 9600 satellite receivers with special software to download the EPG (Electronic Program Guide) from the DISH Network and ExpressVu DBS services. This data (or pseudo-generated events for channels that are not carried on DISH Network or ExpressVu) is used as the basis for the EPG that indexes the data on the tapes produced.
MPEG-2 Format
The MPEG-2 specification allows for many different recording formats. These are the settings we have chosen for our encoders:
| Video | MPEG-2 |
| Video Resolution | 2/3 D1: 480x480 for NTSC, 480x576 for PAL |
| Video Chroma Resolution | 4:2:0 standard |
| Video Bitrate | 3,000,000 bps |
| Audio | MPEG-1 Layer 2 |
| Audio Sampling Rate | 48 kHz stereo |
| Audio Bitrate | 224,000 bps |
| Multiplex | Program Stream |
| Multiplex Bitrate | Approx. 3.275 Mbps |
| PES Packet Size | 1024 |
| Pack Size | One PES packet per pack |
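As a rough check on the settings above, the following sketch works out the multiplex overhead and how long one encoder takes to fill a 1GB buffer file. It assumes that 1GB means 2^30 bytes and that the approximate multiplex bitrate is exactly 3,275,000 bps; the variable names are illustrative only.

```python
# Figures taken from the settings table.
VIDEO_BPS = 3_000_000
AUDIO_BPS = 224_000
MUX_BPS = 3_275_000          # "Approx. 3.275 Mbps", treated as exact here

# Assumption: a 1GB buffer file is 2**30 bytes.
BUFFER_BYTES = 2**30

# Bits per second carried by the two elementary streams.
elementary_bps = VIDEO_BPS + AUDIO_BPS

# Whatever is left over is pack/PES header overhead.
overhead_bps = MUX_BPS - elementary_bps
overhead_pct = 100 * overhead_bps / MUX_BPS

# Time for a single encoder to fill one buffer file.
seconds_per_file = BUFFER_BYTES * 8 / MUX_BPS
minutes_per_file = seconds_per_file / 60

print(f"elementary streams: {elementary_bps} bps")
print(f"multiplex overhead: {overhead_bps} bps ({overhead_pct:.2f}%)")
print(f"one 1GB file holds about {minutes_per_file:.1f} minutes")
```

Under these assumptions each 1GB file holds a little under 44 minutes of one channel.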
MPEG-2 Decoder Compatibility
We have successfully tested extracted video with the following playback devices/programs. Generally speaking, for software playback of MPEG-2 video, a Pentium II class processor running at 333MHz or higher is required.
| Product | Version/OS | Results |
| Sigma Designs Hollywood Plus (PCI Card) | 1.81 / Windows 98 | Almost perfect. Doesn't interpolate the picture up to full D1 when in full-screen mode. |
| Sigma Designs NetStream 2 (PCI Card) | 4.60 (1.60a) / Windows NT 4 SP6 | Perfect |
| Sigma Designs NetStream 2000 (PCI Card) | 1.0 build 125 / Windows NT SP6 & Windows 2000 | Perfect |
| Optibase VideoPlex Express (PCI Card) | 1.2 / Windows NT SP6 & Windows 2000 | Perfect |
| Stradis SDM275 (PCI Card) | 2.01.007 / Windows NT SP6 & Windows 2000 | Perfect |
| PowerDVD (Software DVD Player) | 2.5 / Windows 98 | Perfect |
| MGI Soft DVD MAX (Software DVD Player) | 3.32.03 / Windows 98 | Perfect (but must use a supported AGP graphics card to use this product) |
| MediaMatics DVD Express (Software DVD Player - OEM Product) | 5.00.00.5.6.1 / Windows 98 | Perfect |
| Xing DVD Player (Software DVD Player) | 1.61 / Windows 98 | Doesn't play. Irrelevant since this product has been withdrawn from the market. |
| Ligos (Software DirectShow Filter) | 1.0 / Windows 98 | Occasionally crashes. Doesn't interpolate the picture up to full D1 resolution. |
| ATI DVD Player (Software DVD Player) | 3.5 / Windows 98 | Perfect |
| Herosoft SDVD 2000 | | Works very well. Occasionally, you have to right click on the video window and select "Original Size" to get a picture. |
After a hardware decoder, our recommended player is PowerDVD since it offers very good picture quality and compatibility with many different PCs.
Tape Layout
Tapes are identified by a barcode label that's read by the tape loader. Tapes are written in raw format with fixed 64K blocks. Files are separated by filemarks.
File 0 is a 64K block that contains a copy of the barcode label and a copy of this text in HTML format.
Files 1 through 32 contain MPEG-2 Program Stream Data. The size of each file is 1GB.
File 33 contains the raw event file for the particular recording system. This way, every tape contains not only the MPEG-2 data, but also the description of how to extract it and an index to the data on the tape (as well as other tapes depending on when the recording was made).
Since MPEG-2 data is already well compressed, compression is disabled on the drive. This results in 35 decimal GB of storage per tape or roughly 32.5 real GB.
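The capacity figures can be reproduced with a few lines of arithmetic. This sketch assumes that "decimal GB" means 10^9 bytes, that "real GB" means 2^30 bytes, that each MPEG-2 file is exactly 2^30 bytes, and that a 64K block is 65,536 bytes; none of these definitions is spelled out in the text, so treat them as assumptions.

```python
DECIMAL_GB = 10**9        # assumption: "decimal GB"
BINARY_GB = 2**30         # assumption: "real GB"
BLOCK_BYTES = 64 * 1024   # assumption: a 64K tape block is 65,536 bytes

# Uncompressed capacity quoted for the tape.
capacity_bytes = 35 * DECIMAL_GB
capacity_binary_gb = capacity_bytes / BINARY_GB

# Files 1 through 32 hold MPEG-2 data, 1GB each.
mpeg_files = 32
mpeg_bytes = mpeg_files * BINARY_GB

# Space left over for file 0, the event file, filemarks and rewritten blocks.
spare_binary_gb = (capacity_bytes - mpeg_bytes) / BINARY_GB

# Number of fixed-size tape blocks in one 1GB file.
blocks_per_file = BINARY_GB // BLOCK_BYTES

print(f"capacity: {capacity_binary_gb:.2f} binary GB")
print(f"spare after 32 files: {spare_binary_gb:.2f} binary GB")
print(f"blocks per 1GB file: {blocks_per_file}")
```

With these definitions the capacity comes out at about 32.6 binary GB, which the text rounds to 32.5, leaving roughly half a GB of headroom beyond the 32 MPEG-2 files.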
Some DLT tapes yield slightly less than 32.5GB. This is due to excessive space being used to repeat tape blocks as a result of errors on the tape. For these tapes, as soon as we get an error, the tape is removed from the loader and the current file is then repeated as the first MPEG-2 file on the next tape loaded. These "bad" tapes are then restored to a separate system, erased and then re-recorded. If the tape has 29 or more files already recorded, a copy of the EPG file is written and the tape is considered completed. If fewer than 29 files are on the tape, it's loaded back into a tape drive and filled to capacity like any normal tape.
Event File Layout
The event file is the key to decoding the system. It is in effect a program guide that lists all the events (or programs) that are contained in the archive. This file is tab delimited with a carriage-return & line-feed at the end of each line.
| Parameter | Description |
| event-id | A unique identifier for the event. This is a 28-bit value starting at zero. The top 4 bits indicate the archiving system that recorded the event. |
| start-date-time | The starting date and time of the event using Universal Time. |
| run-time | The running time of the event in hours and minutes |
| tape-id | The barcode ID of the tape containing the event. |
| file-number | The file number on the tape that contains the event. |
| byte-offset | The offset within the file at which the event starts. If an event spans files, the next file will use an offset of zero. |
| channel-name | The name of the channel that carried the event. |
| event-title | Name of the event. |
| event-description | Description of the event. |
The event file can be parsed to generate a web viewable program guide. It is expected that the data will be imported into a database; from there, all that's needed to retrieve an event from tape is the event-id.
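A minimal sketch of parsing the event file might look like the following. It assumes that the columns appear in the same order as the parameter table above and that every line has exactly nine fields. Because the exact encoding of dates, run times and numbers is not described here, all fields are kept as strings. The `Event` class, the function names and the sample values are hypothetical.

```python
from typing import Dict, Iterator, NamedTuple


class Event(NamedTuple):
    """One line of the event file; all fields are kept as raw strings."""
    event_id: str
    start_date_time: str
    run_time: str
    tape_id: str
    file_number: str
    byte_offset: str
    channel_name: str
    event_title: str
    event_description: str


def parse_event_file(text: str) -> Iterator[Event]:
    """Yield one Event per non-empty CR/LF-terminated, tab-delimited line."""
    for line_no, line in enumerate(text.split("\r\n"), start=1):
        if not line:
            continue  # skip the empty string after the final CR/LF
        fields = line.split("\t")
        if len(fields) != len(Event._fields):
            raise ValueError(f"line {line_no}: expected 9 fields, got {len(fields)}")
        yield Event(*fields)


def index_by_event_id(text: str) -> Dict[str, Event]:
    """Build an in-memory lookup table keyed on event-id."""
    return {event.event_id: event for event in parse_event_file(text)}


# Made-up sample data, for illustration only.
sample = (
    "1\t2000-01-08 00:00\t0:30\tTAPE0001\t1\t0\tChannel A\tNews\tEvening news\r\n"
    "2\t2000-01-08 00:30\t1:00\tTAPE0001\t1\t123456\tChannel A\tMovie\tA film\r\n"
)
events = index_by_event_id(sample)
print(events["2"].tape_id, events["2"].file_number, events["2"].byte_offset)
```

When reading a real event file from disk, opening it with `newline=""` keeps the CR/LF terminators intact so that the split above behaves as intended.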
Extracting Events as MPEG-2 Files
This is the algorithm that's used to extract an event and write it as an MPEG-2 Program Stream file. The key to finding an event is the event ID.
MPEG-2 Program Stream files need to start with a specific set of packets to ensure compatibility between decoders. This is how to align the output so that the MPEG-2 files can be read correctly. This applies only to the first file within an event - data after the start of the event is inherently already aligned:
Note: due to clock synchronization problems at our end, tapes recorded prior to July 1, 2000 will have an incorrect byte-offset in the EPG data. A simple workaround is to read from the start of the file and compare the time code in each MPEG-2 GOP header until it matches the start time of the event (both are relative to UTC).
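The workaround in the note can be sketched as a scan for GOP headers. In MPEG-2 video, a group-of-pictures header begins with the start code 00 00 01 B8, followed by a 25-bit time code (drop-frame flag, 5-bit hours, 6-bit minutes, marker bit, 6-bit seconds, 6-bit picture count). The sketch below scans the raw program stream bytes directly, so it assumes that a GOP header is not split across a PES packet boundary and that no other data happens to contain the same byte pattern. The function names are hypothetical, and the returned offset is that of the GOP header itself, not of the enclosing pack.

```python
from typing import Iterator, Optional, Tuple

GOP_START_CODE = b"\x00\x00\x01\xb8"

HMS = Tuple[int, int, int]


def iter_gop_time_codes(data: bytes) -> Iterator[Tuple[int, HMS]]:
    """Yield (byte offset, (hours, minutes, seconds)) for each GOP header."""
    pos = data.find(GOP_START_CODE)
    while pos != -1 and pos + 8 <= len(data):
        # The 25-bit time code sits in the top bits of the next four bytes.
        value = int.from_bytes(data[pos + 4:pos + 8], "big")
        hours = (value >> 26) & 0x1F
        minutes = (value >> 20) & 0x3F
        seconds = (value >> 13) & 0x3F
        yield pos, (hours, minutes, seconds)
        pos = data.find(GOP_START_CODE, pos + 4)


def find_event_start(data: bytes, start_hms: HMS) -> Optional[int]:
    """Return the offset of the first GOP whose time code equals start_hms."""
    for pos, hms in iter_gop_time_codes(data):
        if hms == start_hms:
            return pos
    return None


# Synthetic example: one GOP header stamped 12:34:56, picture 7.
time_code = (12 << 26) | (34 << 20) | (1 << 19) | (56 << 13) | (7 << 7)
stream = b"\xff" * 10 + GOP_START_CODE + time_code.to_bytes(4, "big") + b"\xff" * 5
print(find_event_start(stream, (12, 34, 56)))
```

For a 1GB file it would be more practical to read in chunks that overlap by a few bytes than to load the whole file into memory; the single-buffer version here is only meant to show the bit layout.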
Samples
Sample raw event file
Sample EPG data in HTML format (01/08/2000)
Sample EPG data in HTML format (01/09/2000)
Sample EPG data in HTML format (01/10/2000)
Sample EPG data in HTML format (01/11/2000)