The Freenet Project

Freenet Metadata Spec

Authors: Adam Langley, Eric Norige


Abstract

This spec is for client metadata. Its purpose is to provide convenient functionality for building web sites inside Freenet, as well as a more general description of inserted content.

Overview

Properly formatted metadata is composed of a VersionCommand followed by a list of Parts. A Part is a named list of "variable=value" pairs, followed by the string "EndPart" (or "End" for the last Part). Informally, the version command is a Part whose name is "Version" and which has a numeric field called "Revision". Any metadata that doesn't fit neatly in the variable=value style (XML, etc.) can be included verbatim following the last Part.

For those who like grammars, here is a more formal definition:

metadata := VERSIONPART [part ...] lastpart REST
VERSIONPART := "Version\n" "Revision=1\n" ["Encoding=gzip\n"] "EndPart\n" 
     //Encoding is optional
part := "Document\n" [field ...] "EndPart\n"
lastpart := "Document\n" [field ...] "End\n"
field := KEY '=' VALUE '\n'
KEY := <string not containing either '\n' or '='> 
     //Hierarchical key name
VALUE := <string not containing '\n'> 
     //value of the associated key
REST := <arbitrary data of arbitrary length, not parsed by this parser> 
     //useful for XML metadata or other metadata not 
     //storable in the Info.* keyspace

Example 1. Abstract Metadata Format

Version
Revision=1
EndPart
Document
Key=value
EndPart
Document
Key1=value1
Key2=value2
End
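
The grammar above can be read as a small line-oriented parser. The following Python sketch is one possible reading, not a reference implementation: the function name parse_metadata and its return shape are assumptions made for illustration, and it is more lenient than the grammar in places (for example, it does not check the Revision value).

```python
def parse_metadata(text):
    """Split metadata into (version_fields, document_parts, rest)."""
    lines = text.split("\n")
    parts = []        # one dict of fields per Part, the Version part first
    current = None    # fields of the Part being read, or None between Parts
    rest_start = len(lines)
    for i, line in enumerate(lines):
        if current is None:
            # A Part begins with its name on a line of its own.
            if line not in ("Version", "Document"):
                raise ValueError("unexpected part name: %r" % line)
            if (line == "Version") != (not parts):
                raise ValueError("Version must be the first part, and only the first")
            current = {}
        elif line in ("EndPart", "End"):
            parts.append(current)
            current = None
            if line == "End":
                rest_start = i + 1   # everything after "End" is REST
                break
        else:
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError("not a key=value field: %r" % line)
            current[key] = value     # a repeated key keeps its last value
    else:
        raise ValueError("metadata is missing the closing End line")
    return parts[0], parts[1:], "\n".join(lines[rest_start:])
```

Because the field is split at the first "=", a value may itself contain "=" characters, which matches the KEY and VALUE rules in the grammar.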

Note

"//" in a URI is reserved for metadata processing. This means that the MSK@.....// format is gone. Document Names are the string which comes after the "//". Each Part should have a field with key "Name". A Part without a "Name" field should be assumed to have a Name of "". The Part whose Name matches the Document Name should be acted upon.

Example Name Processing

Assume the following metadata is inserted under KSK@metadata-test:
Version
Revision=1
EndPart
Document
Redirect.Target=KSK@sample.txt
EndPart
Document
Name=target1
DateRedirect.Target=KSK@blogoffoobar
Info.Format=text/html
EndPart
Document
Name=target2
Redirect.Target=KSK@ignored
EndPart
Document
Name=target2
Redirect.Target=CHK@ignored
Redirect.Target=CHK@blahblah
Info.Format=audio/mp3
End

In this case, a request for KSK@metadata-test should return the raw metadata along with whatever data was inserted (usually an empty file is inserted with Control documents like this). It would be nice for the interface to indicate what names are possible, in a directory-listing sort of interface.

Requesting KSK@metadata-test// should invoke the redirect to KSK@sample.txt, and return results identical to requesting KSK@sample.txt directly. The reason for this is that the document name requested is "", and that matches the Name of the first (non-version) Part, which doesn't have a Name field and so is assumed to have name "".

Requesting KSK@metadata-test//target2 should activate the last redirect in the final Part and redirect to CHK@blahblah. Because there is informational metadata here and there could also be Info.* fields in CHK@blahblah, return both sets of metadata, allowing the redirect's metadata (the Format=audio/mp3) to take precedence over a Format field in the CHK's metadata.
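
The precedence rule for informational metadata can be sketched in a few lines of Python. The function name merge_info and the use of plain dictionaries for a Part's fields are assumptions made for this illustration only.

```python
def merge_info(redirect_fields, target_fields):
    """Combine Info.* fields; the redirect's values win on conflict."""
    # Start from the Info.* fields found in the redirect target's metadata.
    merged = {k: v for k, v in target_fields.items() if k.startswith("Info.")}
    # Overlay the Info.* fields of the redirecting Part, which take precedence.
    merged.update(
        (k, v) for k, v in redirect_fields.items() if k.startswith("Info.")
    )
    return merged
```

For the target2 Part above, a Format of audio/mp3 in the redirect would replace any Format found in the metadata of CHK@blahblah, while other Info.* fields from the CHK would be kept.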

As an aside, it is recommended not to have informational metadata in CHKs, as this reduces the likelihood of identical data resulting in the same CHK. The best place to put informational metadata is in the redirect.

Lastly, requesting KSK@metadata-test//target3 should return an error indicating that there is no Part matching that name, similar to a DataNotFound error.

Restatement of document processing algorithm

The logic for following Document names should be the following recursive(!) algorithm:

To request "BaseKey//Name":

  1. request BaseKey (recursively if necessary)
  2. if it's not a control document, stop and return a URI error
  3. act on the name:
     a. Name found:
        act on the matching Part, possibly requesting another key
     b. No Name, no default:
        stop and return a URI error (possibly listing valid names)
     c. No Name, default present:
        act on the default. If it's a redirect to BaseKey2, request "BaseKey2//Name"

The order of keys in a Part is not important, and if the same Key appears twice in a part, only the last one's value is used.
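
The recursive algorithm above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a definitive client: the names resolve, UriError and fetch are hypothetical, fetch(key) is assumed to return a list of Part dictionaries for a control document and None for anything else, only Redirect is handled (DateRedirect and SplitFile are left out), and the depth limit is an arbitrary guard against redirect loops.

```python
class UriError(Exception):
    """Raised when a document name cannot be resolved."""


def resolve(uri, fetch, depth=0):
    """Follow 'BaseKey//Name' and return the key the request ends at."""
    if depth > 16:
        raise UriError("too many redirects")
    base, sep, name = uri.rpartition("//")
    if not sep:
        return uri                                    # plain key, no name to act on
    parts = fetch(resolve(base, fetch, depth + 1))    # step 1
    if parts is None:                                 # step 2
        raise UriError("%s is not a control document" % base)
    # A Part without a Name field has the name "".
    by_name = {fields.get("Name", ""): fields for fields in parts}
    if name in by_name:                               # step 3: name found
        target = by_name[name].get("Redirect.Target")
        if target is None:
            raise UriError("this sketch only follows Redirect parts")
        return resolve(target, fetch, depth + 1)
    if "" in by_name:                                 # step 3: default present
        target = by_name[""].get("Redirect.Target")
        if target is None:
            raise UriError("this sketch only follows Redirect parts")
        return resolve(target + "//" + name, fetch, depth + 1)
    raise UriError("no part named %r; valid names: %s"
                   % (name, sorted(by_name)))
```

Note that a request for the name "" takes the "name found" branch when a Part without a Name field exists, which is why "BaseKey//" follows the default redirect directly.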

Note


All numbers are base 16

Control Document Commands are denoted (CDC) and metadata commands as (MC)

There may only be one section with the same Name.

Part Spec


Redirect (CDC)



Redirect.Target=<URI:>

The Client should redirect to the given URI.

DateRedirect (CDC)



  [DateRedirect.Increment=<number: time-grain size in seconds, default=15180 (hex; 86400 seconds, one day)>]
  [DateRedirect.Offset=<number: time in seconds since the Unix epoch (January 1, 1970) at which the increments start, default=0>]
  DateRedirect.Target=<URI:>


The client should take the current time (GMT) and work out the most recent
member of the series of times (offset, offset + i, offset + 2i, offset + 3i,
...), where i is the increment, that has already occurred. The client then
replaces the part of the URI after the final slash (/string) with
<DATE>-string, where <DATE> is that time, hex encoded as the number of seconds
since the epoch.

Note


"freenet:" is specified explicitly; otherwise someone could supply something like "http://..."

In the case of KSKs, the human readable part is the whole key, so freenet:KSK@style becomes freenet:KSK@3b4cf86e-style

In the case of SSKs, the human readable part is the document name, and freenet:SSK@aabbccddee/style becomes freenet:SSK@aabbccddee/3b4cf86e-style
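
The date calculation can be sketched in Python. The function name date_redirect_uri and its parameters are assumptions for illustration; the sketch takes the increment and offset as integers that have already been decoded from hex, and it treats everything after "@" as the human readable part when the URI contains no slash, as in the KSK case above.

```python
import time

DEFAULT_INCREMENT = 0x15180   # 86400 seconds, one day


def date_redirect_uri(target, now=None, increment=DEFAULT_INCREMENT, offset=0):
    """Return the target URI with the current time-grain prefixed to its name."""
    if now is None:
        now = int(time.time())
    # Most recent member of offset, offset + i, offset + 2i, ... not after now.
    steps = (now - offset) // increment
    stamp = format(offset + steps * increment, "x")
    if "/" in target:
        # SSK style: the document name follows the final slash.
        head, _, tail = target.rpartition("/")
        return head + "/" + stamp + "-" + tail
    if "@" not in target:
        raise ValueError("not a key URI: %r" % target)
    # KSK style: the whole key is the human readable part.
    head, _, tail = target.rpartition("@")
    return head + "@" + stamp + "-" + tail
```

With the default increment and an offset of 0, every request made during the same GMT day produces the same URI, so an author can insert one document per day under predictable keys.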


SplitFile (CDC)


note 12 Feb 2003: Splitfiles without FEC are outdated.



  SplitFile.Size=<hex file size>
  SplitFile.BlockCount=<hex no. of data blocks>
  SplitFile.CheckBlockCount=<hex no. of check blocks>
  SplitFile.Block.<n>=<URI>
  SplitFile.Graph.<x>= a,b,c...


The document is made up of a number of pieces, allowing swarming.

Note


thanks much to thelema, oierw, mjr and others for this


Size Required
This defines the final size of the original file.

BlockCount Required
This defines the number of pieces of data that there are.

CheckBlockCount Optional
This defines the number of check pieces that there are. If omitted, it defaults to 0.

Block.<n> Required
These are the block URIs, most likely CHKs. These must be numbered 1 to BlockCount+CheckBlockCount. The first BlockCount blocks are the data blocks, and the next CheckBlockCount are the check blocks. From the above, a client can start a swarmed download of the file. Redundant splitting is optional; it is described under Graph.<x> below.

Graph.<x> Optional
For each check Block.<x>, there must be a Graph.<x> listing the data blocks that check block derives from. Graph entries for 1..BlockCount should not be given, but for BlockCount+1..BlockCount+CheckBlockCount must be given. A check block may also be derived from other check blocks, but only lower numbered ones.
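
The SplitFile rules above can be checked mechanically. The Python sketch below is an illustration only: the function name read_splitfile is hypothetical, and it assumes that the block indices in the key names and the entries of each Graph list are written in hex, following the note that all numbers are base 16.

```python
def read_splitfile(fields):
    """Return (size, data_blocks, check_blocks, graph) from a Part's fields."""
    size = int(fields["SplitFile.Size"], 16)
    data_count = int(fields["SplitFile.BlockCount"], 16)
    # CheckBlockCount is optional and defaults to 0.
    check_count = int(fields.get("SplitFile.CheckBlockCount", "0"), 16)
    total = data_count + check_count
    # Blocks are numbered 1..total; a missing block raises KeyError.
    blocks = [fields["SplitFile.Block.%x" % n] for n in range(1, total + 1)]
    graph = {}
    for n in range(data_count + 1, total + 1):
        entry = fields["SplitFile.Graph.%x" % n]
        sources = [int(s, 16) for s in entry.split(",")]
        # A check block may only derive from lower numbered blocks.
        if any(s < 1 or s >= n for s in sources):
            raise ValueError("Graph.%x refers to an invalid block" % n)
        graph[n] = sources
    return size, blocks[:data_count], blocks[data_count:], graph
```

A client could use the returned lists to start fetching the data blocks concurrently and fall back to the check blocks named in the graph when a data block cannot be found.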


Info (MC)


   Info.Format=<string: MIME-type>
   Info.Description=<string: freeform>

The Info.* namespace is reserved for Dublin Core metadata. Prepend "Info." to the keys you want to use to prevent collisions. See http://www.freenetproject.org/doc/infometadata.html for details.

Format is the proper place to put the document's MIME type. Description is a plain description of this data, not an abstract or table of contents.


ExtInfo (MC)



  ExtInfo.Trailing=yes
  ExtInfo.URI=<URI>


If ExtInfo.Trailing is set to yes, the metadata for this file will include all data after the final "End" in the control document. If the ExtInfo.URI parameter exists, the contents of the URI pointed to should be included in the metadata for the current document.

Examples


Example 2. Pseudo Website
   Version
   Revision=1
   EndPart
   Document
   Redirect.Target=CHK@aabbccddee
   EndPart
   Document
   Name=split
   SplitFile.Size=102400
   SplitFile.BlockCount=3
   SplitFile.Block.1=freenet:CHK@aabbccddee1
   SplitFile.Block.2=freenet:CHK@aabbccddee2
   SplitFile.Block.3=freenet:CHK@aabbccddee3
   Info.Format=text/plain
   EndPart
   Document
   Name=date-redirect
   DateRedirect.Increment=93a80
   DateRedirect.Offset=a8c0
   DateRedirect.Target=SSK@aabbccddee/something
   End

This would declare a sort of website. Assume it is inserted under
freenet:SSK@aabbccddee/mysite. Accessing freenet:SSK@aabbccddee/mysite
or freenet:SSK@aabbccddee/mysite// would cause the first redirect
(without a Name) to be processed.

If freenet:SSK@aabbccddee/mysite//split were accessed the SplitFile
section would be processed, as would the Info section. This would
(hopefully) swarm a file, with some configurable concurrency. None of the CHKs
being swarmed need any metadata because it's included in the control document.

If freenet:SSK@aabbccddee/mysite//date-redirect were accessed the
DateRedirect section would be processed. This would redirect to some other
URI.

Example 3. TrailingInfo example
   Version
   Revision=1
   EndPart
   Document
   Redirect.Target=CHK@aabbccddee
   ExtInfo.Trailing=yes
   EndPart
   Document
   Name=doc1
   Redirect.Target=CHK@aabbccddee1
   Info.Format=text/plain
   ExtInfo.Trailing=yes
   EndPart
   Document
   Name=doc2
   DateRedirect.Target=SSK@aabbccddee/something
   ExtInfo.URI=CHK@eeddccbbaa
   End
   <XML blah blah>
   blah blah
   </XML>

This describes the same website as above, but with metadata in a trailing field. It's completely reasonable for multiple documents to share the same TrailingInfo metadata. (Since only one needs to be processed, this shouldn't be a problem)

Handling Other Commands


This section is deprecated pending further revision of the spec.


Other commands may be inserted if they follow the general structure of
metadata commands set out above.

These commands may set an Importance header of type string. "Required"
means that the client should ALWAYS stop processing if it doesn't
understand this extended command. "Informational" means that the
client should NEVER stop processing if it doesn't understand this
extended command. "Optional" means it's up to the client to
decide.

If not given, Importance defaults to Informational.

Since a client will process, for a given Name, the first
command it understands, you can do the following:

Note


WaitRedirect is an example of an extended command, not a command defined by this spec.

Example 4. An Extended Command


  WaitRedirect.Time=5
  WaitRedirect.Target=CHK@gargargargar
  Redirect.Target=freenet:CHK@gargargargar

The WaitRedirect will override the Redirect, if the client understands
WaitRedirects.

-- SebastianSpaeth - 24 Feb 2002