Last week my camera failed while recording my daughter's Christmas play. I used my old Kodak because of the low light conditions, as it has a bigger CCD sensor than the Samsung video recorder I also have.
What follows is the story of trying to recover the recording and the lessons learned along the way.
Once I got home, I put the SD card in the computer to assess the damage. The card showed a single MOV file, which was expected because I had emptied the card just before the recording. However, the file size was 0 bytes and the file could not be copied.
Next I checked the disk properties, which showed approximately 800MB in use.
The fact that I had emptied the card and that it still showed space in use gave me some hope of recovering the video, so the next step was to start recovering the file from the card. Before messing with it I made a backup of the card using a Windows port of dd, the very useful disk duplication tool from Linux. You can get the tool here: http://www.chrysocome.net/dd
Using the tool I saved the whole content of the SD card to an image file, which I then used to extract the video file.
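For readers who prefer a script to dd, the same "image first, experiment later" step can be sketched in Python. This is only an illustration, not the tool used above: the function name and the chunk size are my own assumptions, and the source path for a real card is system specific and usually needs administrator rights.

```python
def image_device(source_path, image_path, chunk_size=1024 * 1024):
    """Copy every byte of source_path into image_path and return the byte count.

    source_path is assumed to be something readable as a plain byte stream,
    for example a raw device node or an existing image file.
    """
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of the device or file
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

All later experiments can then run against the image file, leaving the card untouched.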
The next few minutes were spent running different tools, such as scandisk and other file recovery tools, to see if I could recover the file, but with no luck. However, some of the tools were able to recover two old videos and some picture files that were located in the free space.
Now I had a backup image of the SD card and a few pieces of information: the partition format was FAT16, the expected file size was around 800MB, and, because I had cleaned the card before recording, the file most likely sat at the beginning of the card with all its parts following in order. With that, I needed to locate and extract the MOV file.
I have recovered files from Disk Partitions manually before and if you want to learn more about the FAT tables here are some links:
First I needed to learn more about the MOV file format. So I put another empty SD card in the camera and recorded one movie for 1 minute and another for 30 minutes to use as baselines. This was to make sure the baseline videos were recorded with exactly the same settings. If your camera is unusable, you can use any old files already recorded by the same camera, but not files from any other camera.
Next I fired up the browser and started learning about the MOV file format, which is documented by Apple here:
and other web links like:
In summary, the file is composed of blocks called Atoms, each with a header and a data payload. The header contains an identifier and the size of the Atom (header included), and the payload can itself be made of other Atoms, as the format is hierarchical. Based on that specification I fired up Visual Studio and wrote a file parser to see which Atoms are used by the two sample files I saved.
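The Atom layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the parser I wrote in Visual Studio; the function name is made up, and it assumes the standard QuickTime header of a 4-byte big-endian size followed by a 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning "to the end of the container".

```python
import struct

def iter_atoms(data, start=0, end=None):
    """Yield (offset, atom_type, size) for each Atom found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, atom_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit extended size follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The Atom runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # corrupt or truncated Atom: stop rather than guess
        yield pos, atom_type.decode("latin-1"), size
        pos += size
```

Calling `iter_atoms(data)` lists the top-level Atoms; calling it again with `start` just past a container's header and `end` at the container's end walks its children, which is how the hierarchy is explored.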
Using this home-made tool I found that both movies contain three Atoms in the following order: SKIP, MDAT, and MOOV
The documentation says that SKIP is to be skipped. It contained some information about the camera and the KODAK company; in other words, quite useless for my needs so far, but it proved invaluable during the file extraction. MDAT, however, contains all the raw video and audio frames for the movie. The MOOV Atom is composed of other Atoms that describe the video and audio frames inside the MDAT Atom and also attach metadata to the video.
Good, so now I knew that I was looking for those three Atom IDs inside the saved SD card image. So I fired up Visual Studio again and built a tool to extract the locations of those Atoms and their content. Interestingly enough, I was able to recover the two old movies from the free space, just like the other recovery tools did, and I found a SKIP Atom somewhere near the beginning. I forgot to mention that before trying this approach I also tried to read the FAT and directory structure to see if I could identify my file's whereabouts from them, but with no luck, as the structure was corrupted by the camera failure.
So now I knew the file must start where the first SKIP Atom was found, but I had no clue where it would end. It seemed safe to assume that it finished before the next SKIP Atom, so I saved that block to a separate file for further analysis and recovery. I ended up with an 830MB file, which is close enough to the space in use reported above.
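The carving step can be sketched roughly like this in Python. It is a simplified stand-in for the tool described above, with invented function names, and it assumes the Atom type is stored as the lowercase ASCII tag `skip` right after a 4-byte size. A plain byte search can also hit the same four letters inside video data, so a real tool should double-check each hit, for example against the SKIP Atom of a baseline file.

```python
def find_atom_starts(image, tag=b"skip"):
    """Return the offsets of candidate Atom headers (4 bytes before each tag)."""
    starts = []
    pos = image.find(tag)
    while pos != -1:
        if pos >= 4:  # room for the 4-byte size field in front of the tag
            starts.append(pos - 4)
        pos = image.find(tag, pos + 1)
    return starts

def carve_first_recording(image, tag=b"skip"):
    """Return the bytes from the first tagged Atom up to the next one.

    If there is no second Atom with this tag, everything up to the end
    of the image is returned; if there is none at all, None is returned.
    """
    starts = find_atom_starts(image, tag)
    if not starts:
        return None
    end = starts[1] if len(starts) > 1 else len(image)
    return image[starts[0]:end]
```

Both functions only use `find` and slicing, so for a card-sized image a memory-mapped file can be passed instead of a bytes object.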
Investigating the new file, I found that the SKIP Atom was intact and exactly the same as the one in the two baseline files; however, the MDAT and MOOV tags were nowhere to be found. The place where the MDAT header was supposed to be was filled with zeroes. My guess is that it would have been filled in later, once the size was known.
Next I tried to find the MOOV Atom by looking for any tags that would be used inside it, and had no luck. This is because this Atom is the last one written, even though it is the most important one (I will explain that later).
In order to make the file usable in some way, I created an MDAT Atom header manually with a hex editor, using a size that matched the remaining length of the file, and then appended to the end of the file the MOOV Atom extracted as is from the 30-minute baseline video.
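I did this step by hand in a hex editor, but the same patch can be sketched in Python. The function name is hypothetical, and the sketch assumes that the zero-filled MDAT header is the usual 8 bytes (4-byte big-endian size plus the `mdat` tag) and that its offset, typically right after the SKIP Atom, is already known.

```python
import struct

def patch_mdat_and_append_moov(carved, mdat_offset, moov_atom):
    """Return a copy of carved with an MDAT header written at mdat_offset
    and the given MOOV Atom appended at the end."""
    # The MDAT Atom covers its own header plus everything up to the end
    # of the carved data.
    mdat_size = len(carved) - mdat_offset
    if mdat_size < 8 or mdat_size > 0xFFFFFFFF:
        raise ValueError("MDAT size does not fit a 32-bit Atom header")
    patched = bytearray(carved)
    patched[mdat_offset:mdat_offset + 8] = struct.pack(">I4s", mdat_size, b"mdat")
    return bytes(patched) + moov_atom
```

Note that a MOOV Atom borrowed from another file still describes that other file's frames, so a player can open the result but will not decode it cleanly.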
Let’s see what I ended up with. I played this file using the VLC media player and voilà, the file started to play, with distorted video and white noise instead of audio, and from time to time some real audio could be heard for less than a second. This told me that the raw data was there and just needed to be interpreted "correctly", so I went to the next level: understanding the data inside the MDAT Atom.
From Apple’s documentation I found that the MDAT Atom is nothing more than a header with a binary payload in any format; it is the MOOV Atom that is used to process the data inside the MDAT Atom, so I went on to learn more about the MOOV Atom. As you remember, I had just copied it from the baseline video with no modifications whatsoever.
Using the MP4Parser ISO Viewer tool (https://code.google.com/p/mp4parser/) I investigated the data inside MOOV and extracted the following info: resolution, frame rate, video format, audio encoding, and audio sample rate.
In short, the MOOV Atom contains two tracks, their respective info, and data on how to extract them from the MDAT Atom. This MOOV Atom is from the other file, so the information inside it on how to extract the video and audio frames is not very useful to me so far. So how could the video be played if the data to extract it is wrong? From the extracted info I found that neither the video frames nor the audio frames are constant in size, which means that there must be something inside the MDAT Atom to at least identify the video frames, and hopefully the audio ones if I was lucky.
I went deeper again to the next level and learned more about the MP4 format. As most of the first-hand ISO 14496 specifications are not available for free, I had to resort to second-hand information, some of which is listed here:
In short, for our purpose so far, every video frame starts with the following 3-byte signature, known as the Start Code: 0x00 0x00 0x01
As the audio and video frames are interleaved, the block of data between two Start Codes contains either one video frame or one video frame plus one audio frame.
I fired up Visual Studio again and built a small tool to extract a block of a fixed size from just before each Start Code, to identify which ones are audio. The block size I chose was 1600 audio samples. This value was extracted from the sample information Atoms of the MOOV audio track.
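The idea of looking just in front of each Start Code can be sketched like this in Python. The function names are invented for the illustration, and the block size in bytes is left as a parameter: with 8-bit u-Law one sample takes one byte, so how 1600 samples map to bytes for a stereo track is an assumption you would need to check against your own files.

```python
START_CODE = b"\x00\x00\x01"

def find_start_codes(data):
    """Return the offsets of every 0x00 0x00 0x01 Start Code in data."""
    offsets = []
    pos = data.find(START_CODE)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(START_CODE, pos + len(START_CODE))
    return offsets

def blocks_before_start_codes(data, block_size):
    """Return (start_code_offset, candidate_block) pairs.

    Each candidate block holds up to block_size bytes that sit directly
    in front of a Start Code, never reaching back past the previous one.
    """
    blocks = []
    previous_end = 0
    for offset in find_start_codes(data):
        begin = max(previous_end, offset - block_size)
        blocks.append((offset, data[begin:offset]))
        previous_end = offset + len(START_CODE)
    return blocks
```

Each candidate block can then be decoded and inspected to decide whether it ends with audio or is video data all the way through.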
And now it is time to learn about the audio format, which according to the MOOV Atom is coded as u-Law, stereo, 16kHz sample rate, and 16-bit sample size. The u-Law compression achieves a 2:1 compression ratio by reducing a 16-bit sample to an 8-bit encoded number.
Some links with information on the u-Law encoding are provided below:
http://en.wikipedia.org/wiki/%CE%9C-law_algorithm (once the page loads to see the text you have to select all: CTRL+A)
What it boils down to is that decoding u-Law is very simple: all you need is a lookup table.
/* Maps each 8-bit u-Law code (0..255) to a 16-bit linear PCM sample. */
static short MuLawDecompressTable[256] =
{
-32124,-31100,-30076,-29052,-28028,-27004,-25980,-24956,
-23932,-22908,-21884,-20860,-19836,-18812,-17788,-16764,
-15996,-15484,-14972,-14460,-13948,-13436,-12924,-12412,
-11900,-11388,-10876,-10364, -9852, -9340, -8828, -8316,
-7932, -7676, -7420, -7164, -6908, -6652, -6396, -6140,
-5884, -5628, -5372, -5116, -4860, -4604, -4348, -4092,
-3900, -3772, -3644, -3516, -3388, -3260, -3132, -3004,
-2876, -2748, -2620, -2492, -2364, -2236, -2108, -1980,
-1884, -1820, -1756, -1692, -1628, -1564, -1500, -1436,
-1372, -1308, -1244, -1180, -1116, -1052, -988, -924,
-876, -844, -812, -780, -748, -716, -684, -652,
-620, -588, -556, -524, -492, -460, -428, -396,
-372, -356, -340, -324, -308, -292, -276, -260,
-244, -228, -212, -196, -180, -164, -148, -132,
-120, -112, -104, -96, -88, -80, -72, -64,
-56, -48, -40, -32, -24, -16, -8, -1,
32124, 31100, 30076, 29052, 28028, 27004, 25980, 24956,
23932, 22908, 21884, 20860, 19836, 18812, 17788, 16764,
15996, 15484, 14972, 14460, 13948, 13436, 12924, 12412,
11900, 11388, 10876, 10364, 9852, 9340, 8828, 8316,
7932, 7676, 7420, 7164, 6908, 6652, 6396, 6140,
5884, 5628, 5372, 5116, 4860, 4604, 4348, 4092,
3900, 3772, 3644, 3516, 3388, 3260, 3132, 3004,
2876, 2748, 2620, 2492, 2364, 2236, 2108, 1980,
1884, 1820, 1756, 1692, 1628, 1564, 1500, 1436,
1372, 1308, 1244, 1180, 1116, 1052, 988, 924,
876, 844, 812, 780, 748, 716, 684, 652,
620, 588, 556, 524, 492, 460, 428, 396,
372, 356, 340, 324, 308, 292, 276, 260,
244, 228, 212, 196, 180, 164, 148, 132,
120, 112, 104, 96, 88, 80, 72, 64,
56, 48, 40, 32, 24, 16, 8, 0
};
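If you would rather not type the table in, its values can also be computed. The Python sketch below uses the usual G.711 style expansion (invert the bits, then split into sign, 3-bit exponent and 4-bit mantissa); the function names are my own. One entry differs from the table: the last negative entry is listed there as -1, while the formula gives 0.

```python
def ulaw_decode(code):
    """Expand one 8-bit u-Law code (0..255) to a signed 16-bit PCM value."""
    inverted = ~code & 0xFF          # u-Law bytes are stored bit-inverted
    sign = inverted & 0x80
    exponent = (inverted >> 4) & 0x07
    mantissa = inverted & 0x0F
    # 0x84 (132) is the bias that the encoder added before compressing.
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude

def build_decode_table():
    """Return the 256-entry lookup table as a list."""
    return [ulaw_decode(code) for code in range(256)]
```

Decoding a block is then just `[table[b] for b in block]` with `table = build_decode_table()`.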
I also learned how to create wav files from the decoded PCM data so I could listen to the extracted audio. The links with info on the wav file format are provided below:
Now it is time to start Visual Studio again and create a visual tool to see and hear the possible "sound" blocks saved before.
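For the listening part, Python's standard `wave` module takes care of the wav header, so a sketch of that step is short. The function name and the defaults (stereo, 16kHz, matching what the MOOV Atom reported) are assumptions for the illustration; the samples are expected to be already decoded to signed 16-bit values and interleaved left/right.

```python
import struct
import wave

def write_wav(target, samples, channels=2, sample_rate=16000):
    """Write signed 16-bit PCM samples to target (a path or a binary file object)."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        # WAV stores samples as little-endian signed 16-bit integers.
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

Any media player can then open the result, which makes it easy to listen to one block at a time.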
The length of the sound per block is under 1/10 of a second, but you will be amazed by the ability of the best computer (our brain) to tell random data from real audio 🙂
The ear will only get you so far: it only helps identify which blocks contain audio samples and which ones are pure white noise (that is, part of the video track), at the expense of hearing a lot of very loud noise. To find approximately where to split the blocks and mark the beginning of the sound (this can only be done approximately, as there is no identifier or header before the audio frame), plotting the decoded PCM wave helps a lot. Following are some of the visual samples:
Interestingly enough, the second picture, like all the noise from the video parts of the blocks, tends to have many values decoded around 0, even though I expected to see more randomness. However, that came as an aid in detecting where to make the split visually, especially on the many occasions where the line between video and audio is not very clear (e.g. loud high-frequency audio).
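A possible explanation for the values crowding around 0, offered as my own reasoning rather than something from the format documentation: u-Law is a logarithmic code, so most of its 256 codes stand for small amplitudes. If the video bytes behave roughly like random bytes (an assumption), most of them will decode to small values. The sketch below counts how many codes decode at or below a given amplitude, using the usual G.711 style expansion; the function names are hypothetical.

```python
def ulaw_decode(code):
    """Expand one 8-bit u-Law code (0..255) to a signed 16-bit PCM value."""
    inverted = ~code & 0xFF
    sign = inverted & 0x80
    exponent = (inverted >> 4) & 0x07
    mantissa = inverted & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude

def fraction_of_codes_at_or_below(threshold):
    """Return the share of the 256 codes whose decoded magnitude is <= threshold."""
    magnitudes = [abs(ulaw_decode(code)) for code in range(256)]
    return sum(1 for m in magnitudes if m <= threshold) / len(magnitudes)

print(fraction_of_codes_at_or_below(1884))  # prints 0.5
```

In other words, half of all possible byte values decode to a magnitude of at most 1884, which is about 6% of the full scale of 32124.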
Some are easier to identify than others, but with this method there is no way to split the audio from the video to the exact byte, which results in some pings (every 10ms or so) in the audio. That being said, it is still way better than what the media player was able to get using the "invalid" MOOV audio track. The audio that results from trimming the beginning of each block (remember, the end is always correct, as we know that the bytes after it are the start of the following video frame) and joining the resulting PCM samples into a wav file is good enough, especially after running some filters, audio optimizations, and noise cleanup using the Audacity tool (http://audacity.sourceforge.net/)
In my case it seems that roughly one in every 3 frames contains audio samples. This means that 2/3 of the video frames could be reconstructed correctly in the MOOV track sample information, and the rest could be reconstructed approximately given the marked splits. As the video is 30 minutes long and I have extracted over 45K blocks with possible audio, this will take a while to process.
My plan is to only process the first 5 minutes and see if the video quality improves.
The next step would be to analyze the blocks that contain both video and audio deeply enough to find the exact size of the video portion, thus fully restoring the video. This is possible in theory, but impractical given that many other parents taped the event, so I can get an alternate video of it.
During my investigations I did not find many articles about the video recovery process. I hope that this post will help people understand the process of restoring a corrupted video from a camera.
In conclusion, restoring is possible; however, if you know that somebody else has recorded the event, the cheapest and easiest way is to ask that person for a copy.
In general, restoration implies that some data has been lost, resulting in a loss of quality.
If you want to get a corrupted video, it is easy: just power down the camera while recording. Mine had a button broken just enough that when somebody tripped on the camera support, well, you know what happened next if you got this far into reading 🙂
PS: If only there were a header around the audio portion of the block, the restoration would be easier and maybe better. But hey, I guess the person who invented the format never had to support it by restoring corrupted data from his/her camera.