
Comparison of data serialization formats

This is a comparison of data serialization formats, various ways to convert complex objects to sequences of bits. It does not include markup languages used exclusively as document file formats.

Overview

| Name | Creator/Maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references? [e] | Schema/IDL? | Standard APIs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ASN.1 | ISO, IEC, ITU-T | N/A | Yes | ISO/IEC 8824; X.680 series of ITU-T Recommendations | Yes (BER, DER, PER, or custom via ECN) | Yes (XER, GSER, or custom via ECN) | Partial [f] | Yes (built-in) | N/A |
| Bencode | Bram Cohen (creator); BitTorrent, Inc. (maintainer) | N/A | Yes | Part of BitTorrent protocol specification | Partially (numbers and delimiters are ASCII) | Partially | No | No | No |
| BSON | MongoDB | JSON | Yes | BSON Specification | Yes | No | No | No | No |
| Candle Markup | Henry Luo | XML, JSON, JavaFX | Yes | Candle Markup Reference | No | Yes | Yes (XPointer, XPath) | Yes (Candle Pattern Reference) | Yes (XQuery, XPath) |
| Comma-separated values (CSV) | RFC author: Yakov Shafranovich | N/A | Partial (myriad informal variants used) | RFC 4180 (among others) | No | Yes | No | No | No |
| D-Bus Message Protocol | freedesktop.org | N/A | Yes | D-Bus Specification | Yes | Yes (Type Signatures) | No | No | Yes (see D-Bus) |
| JSON | Douglas Crockford | JavaScript syntax | Yes | RFC 4627 | No, but see BSON | Yes | Partial (JSONPath, JPath, JSPON, json:select()) | Partial (JSON Schema Proposal, Kwalify, Rx, Itemscript Schema) | Partial: Clarinet (like SAX), JSONQuery (like XQuery), JSONPath (like XPath) |
| MessagePack | Sadayuki Furuhashi | JSON (loosely) | Yes | MessagePack format specification | Yes | No | No | No | No |
| Netstrings | Dan Bernstein | N/A | Yes | netstrings.txt | Yes | Yes | No | No | No |
| OGDL | Rolf Veen | ? | Yes | 1.0 Working draft | Yes (Binary 1.0 Working draft) | Yes | Yes (Path 1.0 Working draft) | Yes (Schema WD) | |
| Property list | NeXT (creator); Apple (maintainer) | ? | Partial | Public DTD for XML format | Yes [a] | Yes [b] | No | ? | Cocoa, CoreFoundation, OpenStep, GnuStep |
| Protocol Buffers | Google | N/A | Partial | Developer Guide: Encoding | Yes | Partial [d] | No | Yes (built-in) | |
| S-expressions | Internet Draft author: Ron Rivest | Lisp, Netstrings | Partial (largely de facto) | "S-Expressions" Internet Draft | Yes ("Canonical representation") | Yes ("Advanced transport representation") | No | No | |
| Sereal | Yves Orton, Steffen Müller et al | N/A | Yes | Sereal Specification | Yes | No | Yes | No | No |
| Structured Data eXchange Formats | Max Wildgrube | N/A | Yes | RFC 3072 | Yes | No | No | No | |
| Thrift | Facebook (creator); Apache (maintainer) | N/A | No | Original whitepaper | Yes | Partial [c] | No | Yes (built-in) | |
| eXternal Data Representation | Sun Microsystems (creator); IETF (maintainer) | N/A | Yes | RFC 4506 | Yes | No | Yes | Yes | Yes |
| XML | W3C | SGML | Yes | W3C Recommendations: 1.0 (Fifth Edition), 1.1 (Second Edition) | Partial (Binary XML) | Yes | Yes (XPointer, XPath) | Yes (XML schema) | Yes (DOM, SAX, XQuery, XPath) |
| XML-RPC | Dave Winer[1] | XML, SOAP[1] | Yes | XML-RPC Specification | No | Yes | No | No | No |
| YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON[2] | Yes | Version 1.2 | No | Yes | Yes | Partial (Kwalify, Rx, built-in language type-defs) | No |

  • a. The current default format is binary.
  • b. The "classic" format is plain text, and an XML format is also supported.
  • c. Theoretically possible due to abstraction, but no implementation is included.
  • d. The primary format is binary, but a text format is available.[3]
  • e. Means that generic tools/libraries know how to encode, decode, and dereference a reference to another piece of data in the same document. A tool may require the IDL file, but no more. Excludes custom, non-standardized referencing techniques.
  • f. ASN.1 does offer OIDs, a standard format for globally unique identifiers. However, there is no standard for "marking"/"tagging" an arbitrary piece of data in a document with an OID. There is also no standard format for locally unique identifiers within a document. Therefore, a generic ASN.1 tool/library cannot automatically encode/decode/resolve references within a document without help from custom-written program code.

Syntax comparison of human-readable formats

Columns: Format, Null, Boolean true, Boolean false, Integer, Floating-point, String, Array, Associative array/Object
ASN.1 (XML Encoding Rules)
  Null: <foo />
  Boolean true: <foo>true</foo>
  Boolean false: <foo>false</foo>
  Integer: <foo>685230</foo>
  Floating-point: <foo>6.8523015e+5</foo>
  String: <foo>A to Z</foo>
  Array:
    <SeqOfUnrelatedDatatypes> <isMarried>true</isMarried> <hobby /> <velocity>-42.1e7</velocity> <bookname>A to Z</bookname> <bookname>We said, "no".</bookname> </SeqOfUnrelatedDatatypes>
  Associative array/Object:
    An object (the key is a field name):
    <person> <isMarried>true</isMarried> <hobby /> <height>1.85</height> <name>Bob Peterson</name> </person>
    A data mapping (the key is a data value): [a]
    <competition> <measurement> <name>John</name> <height>3.14</height> </measurement> <measurement> <name>Jane</name> <height>2.718</height> </measurement> </competition>

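Candle Markup
  Null: (), ""
  Boolean true: true
  Boolean false: false
  Integer:
    685230
    -685230
  Floating-point: 6.8523015e+5
  String:
    "A to Z"
    """
    A
    to
    Z
    """
  Array: (true, (), -42.1e7, "A to Z")
  Associative array/Object:
    _{%342=true A%20to%20Z=(1, 2, 3)}
    or
    _{ _{key=42 value=true} _{key="A to Z" value=(1, 2, 3)} }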
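
CSV [b]
  Null: null [a] (or an empty element in the row) [a]
  Boolean true:
    1 [a]
    true [a]
  Boolean false:
    0 [a]
    false [a]
  Integer:
    685230
    -685230 [a]
  Floating-point: 6.8523015e+5 [a]
  String:
    A to Z
    "We said, ""no""."
  Array: true,,-42.1e7,"A to Z"
  Associative array/Object:
    42,1
    A to Z,1,2,3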
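
Netstrings [c]
  Null:
    0:, [a]
    4:null, [a]
  Boolean true:
    1:1, [a]
    4:true, [a]
  Boolean false:
    1:0, [a]
    5:false, [a]
  Integer: 6:685230, [a]
  Floating-point: 9:6.8523e+5, [a]
  String: 6:A to Z,
  Array: 29:4:true,0:,7:-42.1e7,6:A to Z,,
  Associative array/Object: 41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,, [a]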
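
JSON
  Null: null
  Boolean true: true
  Boolean false: false
  Integer:
    685230
    -685230
  Floating-point: 6.8523015e+5
  String: "A to Z"
  Array: [true, null, -42.1e7, "A to Z"]
  Associative array/Object: {"42": true, "A to Z": [1, 2, 3]}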
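
OGDL [verification needed]
  Null: null [a]
  Boolean true: true [a]
  Boolean false: false [a]
  Integer: 685230 [a]
  Floating-point: 6.8523015e+5 [a]
  String:
    "A to Z"
    'A to Z'
    NoSpaces
  Array (one item per line):
    true
    null
    -42.1e7
    "A to Z"
  Array (inline):
    (true, null, -42.1e7, "A to Z")
  Associative array/Object (one item per line):
    42
      true
    "A to Z"
      1
      2
      3
  Associative array/Object (inline list):
    42
      true
    "A to Z", (1, 2, 3)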
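
Property list (plain text format)[4]
  Null: N/A
  Boolean true: <*BY>
  Boolean false: <*BN>
  Integer: <*I685230>
  Floating-point: <*R6.8523015e+5>
  String: "A to Z"
  Array: ( <*BY>, <*R-42.1e7>, "A to Z" )
  Associative array/Object: { "42" = <*BY> "A to Z" = ( <*I1>, <*I2>, <*I3> ); }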

Property list (XML format)[5][6]
  Null: N/A
  Boolean true: <true />
  Boolean false: <false />
  Integer: <integer>685230</integer>
  Floating-point: <real>6.8523015e+5</real>
  String: <string>A to Z</string>
  Array:
    <array> <true /> <real>-42.1e7</real> <string>A to Z</string> </array>
  Associative array/Object:
    <dict> <key>42</key> <true /> <key>A to Z</key> <array> <integer>1</integer> <integer>2</integer> <integer>3</integer> </array> </dict>
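
S-expressions
  Null:
    NIL
    nil
  Boolean true:
    T
    #t [e]
    true
  Boolean false:
    NIL
    #f [e]
    false
  Integer: 685230
  Floating-point: 6.8523015e+5
  String:
    abc
    "abc"
    #616263#
    3:abc
    {MzphYmM=}
    |YWJj|
  Array: (T NIL -42.1e7 "A to Z")
  Associative array/Object: ((42 T) ("A to Z" (1 2 3)))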
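
YAML
  Null: ~, null, Null, NULL[7]
  Boolean true: y, Y, yes, Yes, YES, on, On, ON, true, True, TRUE[8]
  Boolean false: n, N, no, No, NO, off, Off, OFF, false, False, FALSE[8]
  Integer:
    685230
    +685_230
    -685230
    02472256
    0x_0A_74_AE
    0b1010_0111_0100_1010_1110
    190:20:30[9]
  Floating-point:
    6.8523015e+5
    685.230_15e+03
    685_230.15
    190:20:30.15
    .inf
    -.inf
    .Inf
    .INF
    .NaN
    .nan
    .NAN[10]
  String:
    A to Z
    "A to Z"
    'A to Z'
  Array (flow style):
    [y, ~, -42.1e7, "A to Z"]
  Array (block style):
    - y
    -
    - -42.1e7
    - A to Z
  Associative array/Object (flow style):
    {"John":3.14, "Jane":2.718}
  Associative array/Object (block style):
    42: y
    A to Z: [1, 2, 3]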

XML [d]
  Null: <null /> [a]
  Boolean true:
    <boolean val="true"/> [a]
    <true /> [a]
  Boolean false:
    <boolean val="false"/> [a]
    <false /> [a]
  Integer: <integer>685230</integer> [a]
  Floating-point: <float>6.8523015e+5</float> [a]
  String: A to Z [a]
  Array: [a]
    <array> <element type="boolean">true</element> <element type="null"/> <element type="float">-42.1e7</element> <element type="string">A to Z</element> </array>
  Associative array/Object:
    <associative-array> <entry> <key type="integer">42</key> <value type="boolean">true</value> </entry> <entry> <key type="string">A to Z</key> <value> <array> <element type="integer" val="1"/> <element type="integer" val="2"/> <element type="integer" val="3"/> </array> </value> </entry> </associative-array>

XML-RPC
  Null: (no entry)
  Boolean true: <value><boolean>1</boolean></value>
  Boolean false: <value><boolean>0</boolean></value>
  Integer: <value><int>685230</int></value>
  Floating-point: <value><double>6.8523015e+5</double></value>
  String: <value><string>A to Z</string></value>
  Array:
    <value><array> <data> <value><boolean>1</boolean></value> <value><double>-42.1e7</double></value> <value><string>A to Z</string></value> </data> </array></value>
  Associative array/Object:
    <value><struct> <member> <name>42</name> <value><boolean>1</boolean></value> </member> <member> <name>A to Z</name> <value> <array> <data> <value><int>1</int></value> <value><int>2</int></value> <value><int>3</int></value> </data> </array> </value> </member> </struct></value>

  • a. One possible encoding; the specification document does not specifically give an encoding for this datatype.
  • b. The RFC CSV specification only deals with delimiters, newlines, and quote characters; it does not directly deal with serializing programming data structures.
  • c. The netstrings specification only deals with nested byte strings; anything else is outside the scope of the specification.
  • d. XML in and of itself is not a data serialization language, but many data serialization formats have been derived from it; as such, there are many different ways, in addition to those shown, to serialize programming data structures into XML.
  • e. This syntax is not compatible with the Internet-Draft, but is used by some dialects of Lisp.
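
As a rough illustration of three of the text encodings compared above, the following Python sketch produces JSON, a nested netstring, and a quoted CSV row using only the standard library. The helper names `netstring`, `netstring_list`, and `csv_row` are hypothetical and introduced here for illustration; the choice of which values to encode as netstring payloads follows the "one possible encoding" caveat in note a, not any rule in the netstrings specification.

```python
import csv
import io
import json


def netstring(payload: bytes) -> bytes:
    """Frame bytes as <decimal length>:<payload>, (illustrative helper)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def netstring_list(items: list[bytes]) -> bytes:
    """Nest netstrings: the outer payload is the concatenated inner netstrings."""
    return netstring(b"".join(netstring(item) for item in items))


def csv_row(fields: list[str]) -> str:
    """Write one CSV record; fields containing commas or quotes get quoted."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(fields)
    return buffer.getvalue()


# JSON: booleans, null, numbers, strings, arrays, and objects map directly.
json_object = json.dumps({"42": True, "A to Z": [1, 2, 3]})
json_array = json.dumps([True, None, -42.1e7, "A to Z"])

# Netstrings: only byte strings exist, so every value is first rendered as text.
netstring_array = netstring_list([b"true", b"", b"-42.1e7", b"A to Z"])

# CSV: an embedded double quote is escaped by doubling it.
quoted_row = csv_row(["true", "", "-42.1e7", 'We said, "no".'])
```

Python's `json.dumps` writes the float as `-421000000.0` rather than `-42.1e7`; both spellings denote the same JSON number. The `csv` module terminates each record with `\r\n` by default.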

Comparison of binary formats

| Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/Object |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ASN.1 (BER or PER encoding) | NULL type | BOOLEAN; BER as 1 byte in binary form | INTEGER; variable length big-endian binary representation up to 2^2^1024 bits | REAL; representation as IEEE double or as three integers (mantissa, base, exponent) | Multiple valid types (VisibleString, PrintableString, GeneralString, UniversalString, UTF8String) | data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order) | user definable type |
| BSON[11] | Null type - 0 bytes for value | True: one byte \x01; False: \x00 | int32: 32-bit little-endian 2's complement or int64: 64-bit little-endian 2's complement | double: little-endian binary64 | UTF-8 encoded, preceded by int32 encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document |
| MessagePack | \xc0 | True: \xc3; False: \xc2 | Single byte "fixnum" (values -32..127) or typecode (one byte) + big-endian (u)int8/16/32/64 | Typecode (one byte) + IEEE single/double | As "fixraw" (single-byte prefix + up to 31 raw bytes) or typecode (one byte) + 2-4 bytes length + raw bytes | As "fixarray" (single-byte prefix + up to 15 array items) or typecode (one byte) + 2-4 bytes length + array items | As "fixmap" (single-byte prefix + up to 15 key-value pairs) or typecode (one byte) + 2-4 bytes length + key-value pairs |
| Netstrings | 0:, | True: 1:1, False: 1:0, | | | | | |
| OGDL Binary | | | | | | | |
| Property list (binary format) | | | | | | | |
| Protocol Buffers[12] | | | Variable encoding length signed 32-bit: varint encoding of "ZigZag"-encoded value (n << 1) XOR (n >> 31). Variable encoding length signed 64-bit: varint encoding of "ZigZag"-encoded value (n << 1) XOR (n >> 63). Constant encoding length 32-bit: 32 bits in little-endian 2's complement. Constant encoding length 64-bit: 64 bits in little-endian 2's complement | floats: little-endian binary32; doubles: little-endian binary64 | UTF-8 encoded, preceded by varint-encoded integer length of string in bytes | Repeated value with the same tag | N/A |
| Sereal | 0x25 | True: 0x3b; False: 0x3a | Single byte POS/NEG (values -16..15) or typecode (one byte) + "varint" encoded variable length integer or typecode (one byte) + "zigzag" encoded variable length integer | Typecode (one byte) + IEEE single/double/quad | As "SHORT_BINARY" (single-byte prefix + up to 31 raw bytes) or typecode (one byte, including boolean UTF8-encoding flag) + "varint" encoded length + raw bytes | As "ARRAYREF" (single-byte prefix + up to 15 array items) or typecode (one byte) + "varint" encoded length + array items | As "HASHREF" (single-byte prefix + up to 15 key-value pairs) or typecode (one byte) + "varint" encoded length + key-value pairs. Distinguishes hashmaps from objects / class instances. |
| Thrift | | | | | | | |
| Structured Data eXchange Formats (SDXF) | | | big-endian signed 24bit or 32bit integer | big-endian IEEE double | either UTF-8 or ISO 8859-1 encoded | list of elements with identical ID and size, preceded by array header with int16 length | chunks can contain other chunks to arbitrary depth |
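
A few of the scalar encodings listed above can be sketched in Python with the standard `struct` module. This is a minimal illustration, not a complete encoder for any of these formats, and the function names are hypothetical. Two details go beyond what the table states and are taken from the respective specifications as understood here: a varint stores 7 bits per byte, least significant group first, with the high bit set on every byte except the last; and a BSON string carries a trailing NUL byte that is counted in its int32 length prefix.

```python
import struct


def zigzag32(n: int) -> int:
    """ZigZag-map a signed 32-bit integer: (n << 1) XOR (n >> 31)."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varint expects a non-negative integer")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)         # final byte, high bit clear
            return bytes(out)


def bson_int32(n: int) -> bytes:
    """32-bit little-endian two's complement."""
    return struct.pack("<i", n)


def bson_double(x: float) -> bytes:
    """Little-endian IEEE 754 binary64."""
    return struct.pack("<d", x)


def bson_string(text: str) -> bytes:
    """int32 length prefix, UTF-8 bytes, trailing NUL (counted in the length)."""
    data = text.encode("utf-8") + b"\x00"
    return struct.pack("<i", len(data)) + data


def msgpack_fixnum(n: int) -> bytes:
    """Single-byte MessagePack integer for values in -32..127."""
    if not -32 <= n <= 127:
        raise ValueError("fixnum range is -32..127")
    return struct.pack("b", n)
```

ZigZag mapping keeps small negative numbers small before varint encoding: `zigzag32(-1)` is 1 and `zigzag32(1)` is 2, so both fit in a single varint byte.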

