
Do you have a description of the ".mov tree of structs" format?



Here you go: http://atomicparsley.sourceforge.net/mpeg-4files.html

Nearly everything inherits from a basic struct that is 8 bytes per atom: { length of self + children, quasi-human readable 4 char code describing contents }

Practically speaking, in C/C++, you can stride by length and switch() on the 4 char type code, using it to cast the read-in data to whatever class/struct you desire.

All of this while being so brutally dumb that you can rewrite it over and over again in about 10 lines of code in most languages.
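To make the "about 10 lines" claim concrete, here is a minimal sketch in Python, assuming the 8-byte header described above with a big-endian length. The name iter_atoms and the sample bytes are made up for illustration, and the special length values 0 and 1 that real files can use are deliberately not handled.

```python
import struct

def iter_atoms(buf, start=0, end=None):
    """Yield (type, payload_offset, payload_size) for each atom in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        # 8-byte header: big-endian uint32 length (header included) + 4-char code
        size, kind = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            break  # lengths 0 and 1 have special meanings in real files; not handled here
        yield kind.decode("latin-1"), pos + 8, size - 8
        pos += size  # stride by length to reach the next sibling

# Hypothetical two-atom buffer, built by hand for illustration
sample = struct.pack(">I4s", 12, b"ftyp") + b"qt  " + struct.pack(">I4s", 8, b"free")
atoms = list(iter_atoms(sample))
```

The caller would then dispatch on the type string and interpret the payload bytes however it likes, which is the switch() step in the C version.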


This is pretty much IFF: http://en.wikipedia.org/wiki/Interchange_File_Format

I suspect that's where it originated.


Simple version (in pseudo-C): struct Atom { uint32 length; uchar type[4]; uchar data[length - 8]; };

The file is a single atom that has other atoms (and random parameters and such) in its data field. You end up with a big tree of atoms which can be parsed as needed. Super simple format -- like the parent, I use atom trees all the time for serialization.
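For the serialization use, a rough round-trip sketch in Python might look like the following. It assumes the struct above with a big-endian length; pack_atom, parse_tree and the type codes "root", "name", "list" and "item" are invented for this example. Since the header does not say whether the data holds children, the parser has to be told which types are containers.

```python
import struct

def pack_atom(kind, payload=b"", children=()):
    """Serialize one atom: 4-byte big-endian length, 4-byte type, then data."""
    body = payload + b"".join(children)
    return struct.pack(">I4s", 8 + len(body), kind) + body

def parse_tree(buf, containers, start=0, end=None):
    """Return a nested list of (type, payload_or_children) tuples."""
    end = len(buf) if end is None else end
    out, pos = [], start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            raise ValueError("bad atom length at offset %d" % pos)
        if kind in containers:
            # Container: the data field is itself a run of atoms
            node = parse_tree(buf, containers, pos + 8, pos + size)
        else:
            node = buf[pos + 8 : pos + size]
        out.append((kind, node))
        pos += size
    return out

# Hypothetical type codes, chosen only for this example
blob = pack_atom(b"root", children=[
    pack_atom(b"name", b"demo"),
    pack_atom(b"list", children=[pack_atom(b"item", b"\x01"), pack_atom(b"item", b"\x02")]),
])
tree = parse_tree(blob, containers={b"root", b"list"})
```

A reader that does not recognize a type can skip it by length without understanding it, which is most of the appeal.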


Sounds almost like the IFF format, which was used for just about everything on the Amiga, and later (with minor changes) served as the basis for Microsoft's RIFF, which underlies WAV files, .AVI and a lot of other formats.

IFF is: struct chunk { char tag[4]; int32 length; byte data[length]; byte padding[length % 2]; }

The padding is to a two-byte boundary; the tag uses printable ASCII exclusively (32-126, with space allowed only as trailing filler), although every format I remember uses upper case + digits. The length includes neither the tag and length fields nor the padding. Microsoft, in a typical "we don't care" move, adopted the spec except that they specified little endian, whereas IFF is originally big endian.

The entire file must be one complete chunk, and is thus limited to 2GB (signed integer length).

This format has been around since at least 1985, and at some point dominated image storage with its "ILBM" chunks, as well as other domains. https://en.wikipedia.org/wiki/Interchange_File_Format
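A small reader for that layout, as a sketch in Python: tag first, then a length that counts only the data, then one pad byte after odd-length data. The little_endian switch covers the RIFF variant. The function name and the sample chunks are made up, and the length is read as unsigned here for simplicity even though the struct above declares it signed.

```python
import struct

def iter_chunks(buf, start=0, end=None, little_endian=False):
    """Yield (tag, data) per chunk; big-endian for IFF, little-endian for RIFF."""
    end = len(buf) if end is None else end
    fmt = "<4sI" if little_endian else ">4sI"
    pos = start
    while pos + 8 <= end:
        tag, length = struct.unpack_from(fmt, buf, pos)
        data_start = pos + 8
        if data_start + length > end:
            raise ValueError("chunk %r overruns its parent" % tag)
        yield tag, buf[data_start : data_start + length]
        pos = data_start + length + (length % 2)  # skip pad byte after odd-length data

# Hypothetical chunks: a 3-byte payload (one pad byte) followed by a 2-byte payload
sample = b"NAME" + struct.pack(">I", 3) + b"abc\x00" + b"BODY" + struct.pack(">I", 2) + b"hi"
chunks = list(iter_chunks(sample))
```

Compared with the atom walker, the only real differences are the field order, the length not counting the header, and the pad byte.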


PNG uses a variation of the same idea (length, type, data), which puts an additional CRC-32 checksum at the end of each chunk. A nice extension for detecting errors.
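A sketch of a chunk with a trailing checksum, using PNG's layout as the concrete case: 4-byte big-endian length of the data, 4-byte type, the data, then a CRC-32 computed over the type and data (not the length). The helper names are made up; zlib.crc32 is the standard library routine.

```python
import struct
import zlib

def pack_png_chunk(kind, data):
    """Length (data only), type, data, then CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)

def read_png_chunk(buf, pos=0):
    """Return (type, data, next_offset); raise ValueError on a CRC mismatch."""
    (length,) = struct.unpack_from(">I", buf, pos)
    kind = buf[pos + 4 : pos + 8]
    data = buf[pos + 8 : pos + 8 + length]
    (stored,) = struct.unpack_from(">I", buf, pos + 8 + length)
    if zlib.crc32(kind + data) & 0xFFFFFFFF != stored:
        raise ValueError("CRC mismatch in %r chunk" % kind)
    return kind, data, pos + 12 + length

# Hypothetical text chunk, round-tripped through both helpers
chunk = pack_png_chunk(b"tEXt", b"Comment\x00hello")
kind, data, nxt = read_png_chunk(chunk)
```

Flipping any byte of the type or data makes read_png_chunk raise, which is the error detection being praised.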


So basically just IFF/RIFF with fields exchanged?

See:

  http://en.wikipedia.org/wiki/Interchange_File_Format

  http://en.wikipedia.org/wiki/Resource_Interchange_File_Format


The big difference between the QT Atom structure and RIFF is that RIFF is a series of independent chunks (IIRC), whereas Atoms are a big tree. Structurally nearly identical, though.


Don't know about RIFF, but IFF files are/can be a big tree - the outer chunk must be one of FORM, LIST or CAT, and many chunk types contain additional chunks, so depending on the file you might get structures of arbitrary depth.
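Roughly, in Python, and assuming the usual IFF rule that a group chunk's data begins with a 4-byte type followed by nested chunks. The function names and the sample bytes are invented for illustration, and the nested "TEST" form is not a real format.

```python
import struct

GROUP_TAGS = {b"FORM", b"LIST", b"CAT "}  # group chunks; "CAT " carries a trailing space

def parse_iff(buf, start=0, end=None):
    """Return nested tuples: (tag, subtype, children) for groups, (tag, data) otherwise."""
    end = len(buf) if end is None else end
    out, pos = [], start
    while pos + 8 <= end:
        tag, length = struct.unpack_from(">4sI", buf, pos)
        body, stop = pos + 8, pos + 8 + length
        if stop > end:
            raise ValueError("chunk %r overruns its parent" % tag)
        if tag in GROUP_TAGS and length >= 4:
            # A group's data starts with a 4-byte type, followed by nested chunks
            out.append((tag, buf[body : body + 4], parse_iff(buf, body + 4, stop)))
        else:
            out.append((tag, buf[body:stop]))
        pos = stop + (length % 2)  # skip pad byte after odd-length data
    return out

def chunk(tag, data):
    """Helper for the example: serialize one chunk with its pad byte."""
    return tag + struct.pack(">I", len(data)) + data + b"\x00" * (len(data) % 2)

# Hypothetical file: a FORM of type ILBM holding a nested FORM and a BODY chunk
inner = chunk(b"FORM", b"TEST" + chunk(b"NAME", b"abc"))
sample = chunk(b"FORM", b"ILBM" + inner + chunk(b"BODY", b"hi"))
tree = parse_iff(sample)
```

The recursion depth simply follows the nesting in the file, so arbitrary depth costs nothing extra in the parser.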


Is it not strange to call it an Atom? Atom is etymologically indivisible, when here we can have arbitrary structure.


Indeed it would be much more accurate to call it a Turtle, since it's Turtles all the way down.

(Under absolutely no circumstances should anyone actually do this)


Yeah, it is -- I've never found it to be a particularly great term, but it's what's used.


Atoms can be linked together to form an arbitrary structure. After all, a tree is a graph.


But once they are linked we call the linkage a molecule or compound.




