RIFF File Format


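 Early last year (I don't remember the details) I analyzed the tags in wav files, because I needed to embed some information in them for the work I was doing.

 I think the tag had a reserved-like area, and that is where I put the information.

 Damn.. I should have written this up back then.

 For now, here is the document at least. I don't even remember where I got it..--;;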


 

General RIFF File Background

 

 

General RIFF description provided by

Robert Shuler <rlshuler@aol.com>

 

General RIFF File Format

 

       RIFF is a Windows file format for storing chunks of multi-media data, associated descriptions, formats, playlists, etc.  The Waveform Audio File Format (.WAV) description below provides a precise description of the data unique to .WAV files, but does not describe the RIFF file structure within which the .WAV data is stored, so I have added this section to describe general RIFF files.

 

       If you read the raw file data you will need to process the structures described in this section.  If you use the RIFF access functions within Windows, they will strip this information off and you will not see it.

 

RIFF Header

 

       A RIFF file has an 8-byte RIFF header, identifying the file, and giving the residual length after the header (i.e. file_length - 8):

 

             struct {    

                char  id[4];       // identifier string = "RIFF"

                DWORD len;         // remaining length after this header

             } riff_hdr;

 

       The riff_hdr is immediately followed by a 4-byte data type identifier.  For .WAV files this is "WAVE" as follows:

 

             char wave_id[4];    // WAVE file identifier = "WAVE"
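
       As a minimal sketch of the header check described above, the following C function validates the first 12 bytes of a buffer. The names parse_riff_wave and rd_u32le are hypothetical, and the code assumes the DWORD is stored low byte first (little-endian), which the text above does not state explicitly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value stored low byte first (assumed RIFF byte order). */
static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the residual length if buf starts with a
   "RIFF" header followed by the "WAVE" identifier; returns 0 otherwise. */
int parse_riff_wave(const unsigned char *buf, size_t size, uint32_t *riff_len)
{
    if (size < 12)
        return 0;                       /* header + type id need 12 bytes */
    if (memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return 0;
    *riff_len = rd_u32le(buf + 4);      /* normally file_length - 8 */
    return 1;
}
```

       The chunks start at offset 12, immediately after the "WAVE" identifier.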

 

RIFF Chunks

 

       The entire remainder of the RIFF file is "chunks".  Each chunk has an 8-byte chunk header identifying the type of chunk, and giving the length in bytes of the data following the chunk header, as follows:

 

             struct {            // CHUNK 8-byte header

                char  id[4];      // identifier, e.g. "fmt " or "data"

                DWORD len;        // remaining chunk length after header

             } chunk_hdr;

                                 // data bytes follow chunk header
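
       Walking the chunk list could look roughly like the sketch below. find_chunk is a hypothetical helper, not part of any Windows API. It assumes little-endian lengths and also assumes that a chunk with an odd data length is followed by one pad byte; that rule comes from the general RIFF specification and is not stated in the text above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Search the chunk area (the bytes after "RIFF" <len> "WAVE") for the
   first chunk whose id matches. Returns a pointer to the chunk data and
   stores its length, or returns NULL if it is missing or truncated. */
const unsigned char *find_chunk(const unsigned char *p, size_t size,
                                const char *id, uint32_t *len_out)
{
    size_t pos = 0;

    while (size - pos >= 8) {
        uint32_t len = rd_u32le(p + pos + 4);

        if (len > size - pos - 8)
            return NULL;                /* chunk runs past the buffer */
        if (memcmp(p + pos, id, 4) == 0) {
            *len_out = len;
            return p + pos + 8;
        }
        pos += 8 + (size_t)len;         /* unknown chunk: skip it */
        if ((len & 1) && pos < size)
            pos++;                      /* assumed pad byte after odd length */
    }
    return NULL;
}
```

       Skipping every chunk whose id does not match is what lets a reader tolerate unexpected chunks.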

 

       This concludes the general RIFF file description.  The types of chunks to expect for .WAV files (unexpected chunks should be allowed for in processing RIFF files) and the format of the content data of each chunk type are described in the sections that follow.

 

 

 

 

 

 

RIFF WAVE (.WAV) file format

 

From: Rob Ryan <ST802200@brownvm.brown.edu>

Organization: Brown University

 

       I found the following lengthy excerpt in a document rmrtf.zrt (it is actually a .zip file) in the vendor/microsoft/multimedia subdirectory at the ftp.uu.net ftp site.  It is presumably beyond the scope (in terms of the amount of detail) of your document, but nevertheless, I thought that it may help you in including references to the Windows .WAV format in the future.

 

       Let me know if you have any questions/comments.  Again, thank you for your helpful summary.  Keep it up!

 

       The following is taken from RIFFMCI.RTF, "Multimedia Programming Interface and Data Specification v1.0", a Windows RTF (Rich Text Format) file contained in the .zip file, RMRTF.ZRT.  The original document is quite long and this constitutes pages 83-95 of the text format version (starting on roughly page 58 of the RTF version).  If you would like a PostScript version, let me know and I can make one up for you.

 

Waveform Audio File Format (WAVE)

 

       This section describes the Waveform format, which is used to represent digitized sound.

 

       The WAVE form is defined as follows. Programs must expect (and ignore) any unknown chunks encountered, as with all RIFF forms. However, <fmt-ck> must always occur before <wave-data>, and both of these chunks are mandatory in a WAVE file.

 

<WAVE-form> ->

       RIFF( 'WAVE'

       <fmt-ck>             // Format

       [<fact-ck>]         // Fact chunk

       [<cue-ck>]          // Cue points

       [<playlist-ck>]            // Playlist

       [<assoc-data-list>]        // Associated data list

       <wave-data>   )             // Wave data

 

WAVE chunks are described in the following sections.

 

WAVE Format Chunk

 

       The WAVE format chunk <fmt-ck> specifies the format of the <wave-data>. The <fmt-ck> is defined as follows:

 

<fmt-ck> ->   fmt( <common-fields> <format-specific-fields> )

 

<common-fields> ->

       struct

       {

             WORD wFormatTag;         // Format category

             WORD wChannels;          // Number of channels

             DWORD dwSamplesPerSec;   // Sampling rate

             DWORD dwAvgBytesPerSec;  // For buffer estimation

             WORD wBlockAlign;        // Data block size

       }

 

Common Fields Chunk

 

       The fields in the <common-fields> chunk are as follows:

 

       Field         Description

       wFormatTag          A number indicating the WAVE format category of

the file. The content of the <format-specific-fields> portion of the `fmt' chunk, and the interpretation of the waveform data, depend on this value. You must register any new WAVE format categories. See ``Registering Multimedia Formats'' in Chapter 1, ``Overview of Multimedia,'' for information on registering WAVE format categories. ``Wave Format Categories,'' following this section, lists the currently defined WAVE format categories.

 

       wChannels           The number of channels represented in the

waveform data, such as 1 for mono or 2 for stereo.

 

       dwSamplesPerSec     The sampling rate (in samples per second)

at which each channel should be played.

 

       dwAvgBytesPerSec    The average number of bytes per second

at which the waveform data should be transferred. Playback software can estimate the buffer size using this value.

 

       wBlockAlign         The block alignment (in bytes) of the waveform

data.  Playback software needs to process a multiple of wBlockAlign bytes of data at a time, so the value of wBlockAlign can be used for buffer alignment.
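
       A reader that does not want to depend on compiler struct packing can decode the 14 bytes of <common-fields> one field at a time. The sketch below does that; struct wave_common and parse_common_fields are hypothetical names, and little-endian storage is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* In-memory copy of <common-fields>; member names follow the text. */
struct wave_common {
    uint16_t wFormatTag;
    uint16_t wChannels;
    uint32_t dwSamplesPerSec;
    uint32_t dwAvgBytesPerSec;
    uint16_t wBlockAlign;
};

static uint16_t rd_u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* data/len describe the data of the fmt chunk (after its 8-byte header).
   Returns 1 on success, 0 if the chunk is too short. */
int parse_common_fields(const unsigned char *data, size_t len,
                        struct wave_common *out)
{
    if (len < 14)
        return 0;
    out->wFormatTag       = rd_u16le(data);
    out->wChannels        = rd_u16le(data + 2);
    out->dwSamplesPerSec  = rd_u32le(data + 4);
    out->dwAvgBytesPerSec = rd_u32le(data + 8);
    out->wBlockAlign      = rd_u16le(data + 12);
    return 1;
}
```

       Any bytes after offset 14 belong to the <format-specific-fields>.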

 

Format Specific Fields Chunk

 

       The <format-specific-fields> consists of zero or more bytes of parameters. Which parameters occur depends on the WAVE format category; see the following section for details. Playback software should be written to allow for (and ignore) any unknown <format-specific-fields> parameters that occur at the end of this field.

 

WAVE Format Categories

 

       The format category of a WAVE file is specified by the value of the wFormatTag field of the `fmt' chunk. The representation of data in <wave-data>, and the content of the <format-specific-fields> of the `fmt' chunk, depend on the format category.

 

       The currently defined open non-proprietary WAVE format categories are as follows:

 

wFormatTag Value                 Format Category

 

WAVE_FORMAT_PCM     (0x0001)      Microsoft Pulse Code Modulation (PCM)

 

       The following are the registered proprietary WAVE format categories:

 

wFormatTag Value                 Format Category

 

IBM_FORMAT_MULAW    (0x0101)     IBM mu-law format

IBM_FORMAT_ALAW     (0x0102)     IBM a-law format

IBM_FORMAT_ADPCM    (0x0103)     IBM AVC Adaptive Differential PCM format

 

Microsoft WAVE_FORMAT_PCM format

 

       The following sections describe the Microsoft WAVE_FORMAT_PCM format.  If the wFormatTag field of the <fmt-ck> is set to WAVE_FORMAT_PCM, then the waveform data consists of samples represented in pulse code modulation (PCM) format. For PCM waveform data, the <format-specific-fields> is defined as follows:

 

<PCM-format-specific> ->

       struct

       {

             WORD wBitsPerSample;       // Sample size

       }

 

       The wBitsPerSample field specifies the number of bits of data used to represent each sample of each channel. If there are multiple channels, the sample size is the same for each channel.

 

       For PCM data, the dwAvgBytesPerSec field of the `fmt' chunk should be equal to the following formula rounded up to the next whole number:

 

                                     wBitsPerSample

       wChannels x dwSamplesPerSec x --------------

                                           8

 

       The wBlockAlign field should be equal to the following formula, rounded up to the next whole number:

 

                   wBitsPerSample

       wChannels x --------------

                         8
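
       The two derived fields can be computed as in the sketch below (hypothetical function names). One assumption is made loudly here: each sample is first rounded up to a whole number of bytes, and the byte rate is taken as the sampling rate times the block alignment. That reproduces all three fmt() examples in ``Examples of PCM WAVE Files'' (including 132300 for 20-bit mono at 44.1 kHz), whereas applying the byte-rate formula literally to 20-bit data would give 110250.

```c
#include <stdint.h>

/* Bytes needed to hold one sample of one channel (e.g. 20 bits -> 3). */
uint16_t pcm_sample_bytes(uint16_t bits)
{
    return (uint16_t)((bits + 7) / 8);
}

/* wBlockAlign: bytes for one sample of every channel. */
uint16_t pcm_block_align(uint16_t channels, uint16_t bits)
{
    return (uint16_t)(channels * pcm_sample_bytes(bits));
}

/* dwAvgBytesPerSec, assumed here to be sampling rate x block alignment. */
uint32_t pcm_avg_bytes_per_sec(uint16_t channels, uint32_t rate, uint16_t bits)
{
    return rate * pcm_block_align(channels, bits);
}
```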

 

Data Packing for PCM WAVE Files

 

       In a single-channel WAVE file, samples are stored consecutively. For stereo WAVE files, channel 0 represents the left channel, and channel 1 represents the right channel. The speaker position mapping for more than two channels is currently undefined. In multiple-channel WAVE files, samples are interleaved.

 

       The following diagrams show the data packing for 8-bit mono and stereo WAVE files:

 

Data Packing for 8-Bit Mono PCM:

 

             Sample 1     Sample 2     Sample 3     Sample 4

             ---------    ---------    ---------    ---------

             Channel 0    Channel 0    Channel 0    Channel 0

 

Data Packing for 8-Bit Stereo PCM:

 

                    Sample 1                   Sample 2

             ---------------------      ---------------------

             Channel 0    Channel 1    Channel 0    Channel 1

              (left)       (right)      (left)       (right)

 

       The following diagrams show the data packing for 16-bit mono and stereo WAVE files:

 

Data Packing for 16-Bit Mono PCM:

 

                    Sample 1                  Sample 2

             ----------------------     ----------------------

             Channel 0    Channel 0    Channel 0    Channel 0

             low-order    high-order   low-order    high-order

             byte          byte         byte         byte

 

Data Packing for 16-Bit Stereo PCM:

 

                                 Sample 1

             ---------------------------------------------

             Channel 0    Channel 0    Channel 1    Channel 1

             (left)       (left)       (right)      (right)

             low-order    high-order   low-order     high-order

             byte          byte          byte         byte
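
       The 16-bit stereo layout in the last diagram can be unpacked as sketched below. split_stereo16 is a hypothetical helper; it assumes the caller has already located the data bytes and that they contain frames x 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one signed 16-bit sample stored low-order byte first. */
static int16_t rd_s16le(const unsigned char *p)
{
    int32_t v = p[0] | (p[1] << 8);

    if (v >= 0x8000)
        v -= 0x10000;                   /* map 0x8000..0xFFFF to negatives */
    return (int16_t)v;
}

/* Split interleaved 16-bit stereo PCM into separate channel arrays.
   Each frame is: left low, left high, right low, right high. */
void split_stereo16(const unsigned char *data, size_t frames,
                    int16_t *left, int16_t *right)
{
    size_t i;

    for (i = 0; i < frames; i++) {
        left[i]  = rd_s16le(data + i * 4);      /* channel 0 */
        right[i] = rd_s16le(data + i * 4 + 2);  /* channel 1 */
    }
}
```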

 

Data Format of the Samples

 

       Each sample is contained in an integer i. The size of i is the smallest number of bytes required to contain the specified sample size. The least significant byte is stored first. The bits that represent the sample amplitude are stored in the most significant bits of i, and the remaining bits are set to zero.

 

       For example, if the sample size (recorded in wBitsPerSample) is 12 bits, then each sample is stored in a two-byte integer. The least significant four bits of the first (least significant) byte are set to zero.

 

       The data format and maximum and minimum values for PCM waveform samples of various sizes are as follows:

 

             Sample Size           Data Format          Max. Value                     Min. Value

             One to eight bits     Unsigned integer     255 (0xFF)                     0

             Nine or more bits     Signed integer i     Largest positive value of i    Most negative value of i

 

       For example, the maximum, minimum, and midpoint values for 8-bit and 16-bit PCM waveform data are as follows:

 

             Format        Max. Value        Min. Value          Midpoint Value

             8-bit PCM     255 (0xFF)        0                   128 (0x80)

             16-bit PCM    32767 (0x7FFF)    -32768 (-0x8000)    0
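
       The two conventions above (unsigned 8-bit data with a midpoint of 128, and wider samples left-justified in a signed container) can be illustrated with the sketch below. Both function names are hypothetical, and the second function only covers sample sizes of 9 to 16 bits held in a two-byte integer.

```c
#include <stdint.h>

/* 8-bit PCM is unsigned with midpoint 128; recentre it to -128..127. */
int pcm8_to_signed(unsigned char s)
{
    return (int)s - 128;
}

/* A sample of 9..16 bits sits in the most significant bits of a 16-bit
   container with the unused low bits set to zero. Dividing by the
   padding factor recovers the value in its native range, e.g.
   -2048..2047 for 12 bits. Division is exact because the low bits are 0. */
int pcm16_container_to_value(int16_t container, unsigned bits)
{
    return container / (1 << (16 - bits));
}
```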

 

Examples of PCM WAVE Files

 

       Example of a PCM WAVE file with 11.025 kHz sampling rate, mono, 8 bits per sample:

 

             RIFF( 'WAVE' fmt(1, 1, 11025, 11025, 1, 8)

                                 data( <wave-data> ) )

 

       Example of a PCM WAVE file with 22.05 kHz sampling rate, stereo, 8 bits per sample:

 

             RIFF( 'WAVE' fmt(1, 2, 22050, 44100, 2, 8)

                                 data( <wave-data> ) )

 

       Example of a PCM WAVE file with 44.1 kHz sampling rate, mono, 20 bits per sample:

 

             RIFF( 'WAVE'     INFO(INAM("O Canada"Z))

                                 fmt(1, 1, 44100, 132300, 3, 20)

                                 data( <wave-data> ) )
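
       On disk, the simplest of these layouts (a `fmt' chunk followed directly by a single `data' chunk) amounts to a 44-byte prefix in front of the samples. The sketch below builds that prefix. build_pcm_wav_header is a hypothetical name, little-endian storage is assumed, and the pad byte counted for an odd data length is an assumption taken from the general RIFF rules rather than from this text.

```c
#include <stdint.h>
#include <string.h>

static void wr_u16le(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
}

static void wr_u32le(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Fill out[0..43] with RIFF header, "WAVE", a 16-byte PCM fmt chunk and
   the data chunk header. The caller appends data_len sample bytes. */
void build_pcm_wav_header(unsigned char out[44], uint16_t channels,
                          uint32_t rate, uint16_t bits, uint32_t data_len)
{
    uint16_t align = (uint16_t)(channels * ((bits + 7) / 8));

    memcpy(out, "RIFF", 4);
    wr_u32le(out + 4, 36 + data_len + (data_len & 1)); /* file length - 8 */
    memcpy(out + 8, "WAVE", 4);
    memcpy(out + 12, "fmt ", 4);
    wr_u32le(out + 16, 16);             /* 14 common bytes + wBitsPerSample */
    wr_u16le(out + 20, 1);              /* WAVE_FORMAT_PCM */
    wr_u16le(out + 22, channels);
    wr_u32le(out + 24, rate);
    wr_u32le(out + 28, rate * align);   /* dwAvgBytesPerSec */
    wr_u16le(out + 32, align);          /* wBlockAlign */
    wr_u16le(out + 34, bits);
    memcpy(out + 36, "data", 4);
    wr_u32le(out + 40, data_len);
}
```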

 

Storage of WAVE Data

 

       The <wave-data> contains the waveform data. It is defined as follows:

 

       <wave-data> ->   { <data-ck> : <data-list> }

       <data-ck>   ->   data( <wave-data> )

       <wave-list> ->   LIST( 'wavl' { <data-ck> :    // Wave samples

                                 <silence-ck> }... ) // Silence

       <silence-ck> ->  slnt( <dwSamples:DWORD> )     // Count of

                                                            // silent samples

 

       Note:  The `slnt' chunk represents silence, not necessarily a repeated zero volume or baseline sample. In 16-bit PCM data, if the last sample value played before the silence section is a 10000, then if data is still output to the D to A converter, it must maintain the 10000 value. If a zero value is used, a click may be heard at the start and end of the silence section.  If play begins at a silence section, then a zero value might be used since no other information is available. A click might be created if the data following the silent section starts with a nonzero value.

 

FACT Chunk

 

       The <fact-ck> fact chunk stores important information about the contents of the WAVE file. This chunk is defined as follows:

 

       <fact-ck> -> fact( <dwFileSize:DWORD> ) // Number of samples

 

       The `fact' chunk is required if the waveform data is contained in a `wavl' LIST chunk and for all compressed audio formats. The chunk is not required for PCM files using the `data' chunk format.

 

       The "fact" chunk will be expanded to include any other information required by future WAVE formats. Added fields will appear following the <dwFileSize> field. Applications can use the chunk size field to determine which fields are present.

 

Cue-Points Chunk

 

       The <cue-ck> cue-points chunk identifies a series of positions in the waveform data stream. The <cue-ck> is defined as follows:

 

       <cue-ck> ->   cue( <dwCuePoints:DWORD>   // Count of cue points

                           <cue-point>... )           // Cue-point table

       <cue-point> ->      struct

                           {

                                 DWORD  dwName;

                                 DWORD  dwPosition;

                                 FOURCC fccChunk;

                                 DWORD  dwChunkStart;

                                 DWORD  dwBlockStart;

                                 DWORD  dwSampleOffset;

                           }

 

       The <cue-point> fields are as follows:

 

       Field               Description

       dwName              Specifies the cue point name. Each <cue-point> record must have a unique dwName field.

       dwPosition          Specifies the sample position of the cue point. This is the sequential sample number within the play order. See ``Playlist Chunk,'' later in this document, for a discussion of the play order.

       fccChunk            Specifies the name or chunk ID of the chunk containing the cue point.

       dwChunkStart        Specifies the file position of the start of the chunk containing the cue point. This is a byte offset relative to the start of the data section of the `wavl' LIST chunk.

       dwBlockStart        Specifies the file position of the start of the block containing the position. This is a byte offset relative to the start of the data section of the `wavl' LIST chunk.

       dwSampleOffset      Specifies the sample offset of the cue point relative to the start of the block.
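
       Since every <cue-point> record is six 4-byte fields (24 bytes), the cue chunk data can be decoded as sketched below. struct cue_point mirrors the definition above; parse_cue_chunk is a hypothetical helper and little-endian storage is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct cue_point {
    uint32_t dwName;
    uint32_t dwPosition;
    char     fccChunk[4];               /* FOURCC, not NUL-terminated */
    uint32_t dwChunkStart;
    uint32_t dwBlockStart;
    uint32_t dwSampleOffset;
};

static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* data/len describe the cue chunk data. Decodes at most max records and
   returns how many were stored, or -1 if the chunk is too short for the
   count it declares. */
int parse_cue_chunk(const unsigned char *data, size_t len,
                    struct cue_point *out, size_t max)
{
    uint32_t count;
    size_t i, n;

    if (len < 4)
        return -1;
    count = rd_u32le(data);             /* dwCuePoints */
    if (count > (len - 4) / 24)
        return -1;
    n = count < max ? count : max;
    for (i = 0; i < n; i++) {
        const unsigned char *r = data + 4 + i * 24;

        out[i].dwName         = rd_u32le(r);
        out[i].dwPosition     = rd_u32le(r + 4);
        memcpy(out[i].fccChunk, r + 8, 4);
        out[i].dwChunkStart   = rd_u32le(r + 12);
        out[i].dwBlockStart   = rd_u32le(r + 16);
        out[i].dwSampleOffset = rd_u32le(r + 20);
    }
    return (int)n;
}
```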

 

Examples of File Position Values

 

       The following table describes the <cue-point> field values for a WAVE file containing multiple `data' and `slnt' chunks enclosed in a `wavl' LIST chunk:

 

Cue Point Location              Field               Value

In a `slnt' chunk               fccChunk            FOURCC value `slnt'.

                                dwChunkStart        File position of the `slnt' chunk relative to the start of the data section in the `wavl' LIST chunk.

                                dwBlockStart        File position of the data section of the `slnt' chunk relative to the start of the data section of the `wavl' LIST chunk.

                                dwSampleOffset      Sample position of the cue point relative to the start of the `slnt' chunk.

 

In a PCM `data' chunk           fccChunk            FOURCC value `data'.

                                dwChunkStart        File position of the `data' chunk relative to the start of the data section in the `wavl' LIST chunk.

                                dwBlockStart        File position of the cue point relative to the start of the data section of the `wavl' LIST chunk.

                                dwSampleOffset      Zero value.

 

In a compressed `data' chunk    fccChunk            FOURCC value `data'.

                                dwChunkStart        File position of the start of the `data' chunk relative to the start of the data section of the `wavl' LIST chunk.

                                dwBlockStart        File position of the enclosing block relative to the start of the data section of the `wavl' LIST chunk. The software can begin the decompression at this point.

                                dwSampleOffset      Sample position of the cue point relative to the start of the block.

 

       The following table describes the <cue-point> field values for a WAVE file containing a single `data' chunk:

 

Cue Point Location              Field               Value

Within PCM data                 fccChunk            FOURCC value `data'.

                                dwChunkStart        Zero value.

                                dwBlockStart        Zero value.

                                dwSampleOffset      Sample position of the cue point relative to the start of the `data' chunk.

 

In a compressed `data' chunk    fccChunk            FOURCC value `data'.

                                dwChunkStart        Zero value.

                                dwBlockStart        File position of the enclosing block relative to the start of the `data' chunk. The software can begin the decompression at this point.

                                dwSampleOffset      Sample position of the cue point relative to the start of the block.

 

Playlist Chunk

 

       The <playlist-ck> playlist chunk specifies a play order for a series of cue points. The <playlist-ck> is defined as follows:

 

<playlist-ck> -> plst(     <dwSegments:DWORD>   // Count of play segments

                           <play-segment>... )  // Play-segment table

 

<play-segment> ->  struct {

                                 DWORD dwName;

                                 DWORD dwLength;

                                 DWORD dwLoops;

                              }

 

       The <play-segment> fields are as follows:

 

             Field               Description

             dwName              Specifies the cue point name. This value

must match one of the names listed in the <cue-ck> cue-point table.

 

             dwLength            Specifies the length of the section

in samples.

 

             dwLoops             Specifies the number of times to play

the section.
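
       One use of the play-segment table is to work out how many samples the play order covers. The sketch below sums dwLength x dwLoops over all 12-byte <play-segment> records. playlist_total_samples is a hypothetical helper, little-endian storage is assumed, and treating the product as the total played length is an interpretation of the field descriptions above rather than something the text states.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* data/len describe the plst chunk data. Stores the summed length in
   samples and returns 1, or returns 0 if the chunk is too short for the
   segment count it declares. */
int playlist_total_samples(const unsigned char *data, size_t len,
                           uint64_t *total)
{
    uint32_t count, i;

    if (len < 4)
        return 0;
    count = rd_u32le(data);             /* dwSegments */
    if (count > (len - 4) / 12)
        return 0;
    *total = 0;
    for (i = 0; i < count; i++) {
        const unsigned char *seg = data + 4 + (size_t)i * 12;
        uint64_t length = rd_u32le(seg + 4);    /* dwLength */
        uint64_t loops  = rd_u32le(seg + 8);    /* dwLoops */

        *total += length * loops;       /* seg + 0 is dwName, unused here */
    }
    return 1;
}
```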

 

Associated Data Chunk

 

       The <assoc-data-list> associated data list provides the ability to attach information like labels to sections of the waveform data stream. The <assoc-data-list> is defined as follows:

 

<assoc-data-list> ->  LIST('adtl'

                                 <labl-ck>           // Label

                                 <note-ck>           // Note

                                 <ltxt-ck>           // Text with data length

                                 <file-ck> )   // Media file

 

<labl-ck> -> labl(  <dwName:DWORD> <data:ZSTR> )

 

<note-ck> -> note(  <dwName:DWORD> <data:ZSTR> )

 

<ltxt-ck> -> ltxt(  <dwName:DWORD>

                           <dwSampleLength:DWORD>

                           <dwPurpose:DWORD>

                           <wCountry:WORD>

                           <wLanguage:WORD>

                           <wDialect:WORD>

                           <wCodePage:WORD>

                           <data:BYTE>... )

 

<file-ck> -> file(  <dwName:DWORD>

                          <dwMedType:DWORD>

                           <fileData:BYTE>...)

 

Label and Note Information

 

       The `labl' and `note' chunks have similar fields. The `labl' chunk contains a label, or title, to associate with a cue point. The `note' chunk contains comment text for a cue point. The fields are as follows:

 

             Field               Description

             dwName              Specifies the cue point name.  This value must match one of the names listed in the <cue-ck> cue-point table.

 

             data                Specifies a NULL-terminated string containing a text label (for the `labl' chunk) or comment text (for the `note' chunk).
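
       Reading a `labl' or `note' chunk then comes down to one DWORD followed by a zero-terminated string. The sketch below (hypothetical helper parse_labl, little-endian storage assumed) refuses chunks whose string is not terminated inside the chunk data, so the returned pointer is always safe to treat as a C string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t rd_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* data/len describe the data of a labl or note chunk. On success stores
   the cue point name and a pointer to the text inside data, returns 1.
   Returns 0 if the chunk is too short or the text is not terminated. */
int parse_labl(const unsigned char *data, size_t len,
               uint32_t *name, const char **text)
{
    if (len < 5)
        return 0;                       /* dwName plus at least the NUL */
    if (memchr(data + 4, '\0', len - 4) == NULL)
        return 0;                       /* no terminator inside the chunk */
    *name = rd_u32le(data);
    *text = (const char *)(data + 4);
    return 1;
}
```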

 

Text with Data Length Information

 

       The `ltxt' chunk contains text that is associated with a data segment of specific length. The chunk fields are as follows:

 

             Field               Description

             dwName              Specifies the cue point name.  This value must match one of the names listed in the <cue-ck> cue-point table.

 

             dwSampleLength      Specifies the number of samples in the

segment of waveform data.

 

             dwPurpose           Specifies the type or purpose of the text. For example, dwPurpose can specify a FOURCC code like `scrp' for script text or `capt' for close-caption text.

 

             wCountry            Specifies the country code for the text. See ``Country Codes'' in Chapter 2, ``Resource Interchange File Format,'' for a current list of country codes.

 

             wLanguage,          Specify the language and dialect codes

             wDialect            for the text. See ``Language and Dialect

Codes'' in Chapter 2, ``Resource Interchange File Format,'' for a current list of language and dialect codes.

 

             wCodePage           Specifies the code page for the text.

 

Embedded File Information

 

       The `file' chunk contains information described in other file formats (for example, an `RDIB' file or an ASCII text file). The chunk fields are as follows:

 

             Field               Description

             dwName              Specifies the cue point name.  This value must match one of the names listed in the <cue-ck> cue-point table.

 

             dwMedType           Specifies the file type contained in the

fileData field. If the fileData section contains a RIFF form, the dwMedType field is the same as the RIFF form type for the file.  This field can contain a zero value.

 

             fileData            Contains the media file.