IDEA0 format definition

Keys use CamelCase; however, to avoid confusion, they must be case-insensitively unique within their parent object. When parsing, the keys “ID”, “id”, “iD” and “Id” must be considered equivalent.
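The case rule above can be enforced while parsing. The following is a minimal sketch, not part of the specification: the function names reject_case_collisions and get_key are hypothetical, and the hook relies on the standard object_pairs_hook parameter of Python's json module.

```python
import json

def reject_case_collisions(pairs):
    """object_pairs_hook: build a dict, failing on keys that are equal ignoring case."""
    seen = {}
    result = {}
    for key, value in pairs:
        folded = key.casefold()
        if folded in seen:
            raise ValueError(f"keys {seen[folded]!r} and {key!r} collide")
        seen[folded] = key
        result[key] = value
    return result

def get_key(obj, name):
    """Case-insensitive lookup of a key; returns None when the key is absent."""
    for key, value in obj.items():
        if key.casefold() == name.casefold():
            return value
    return None

# The hook is applied to every JSON object, so nested objects are checked too.
message = json.loads('{"Format": "IDEA0", "id": "abc"}',
                     object_pairs_hook=reject_case_collisions)
```

With this hook, a document containing both “ID” and “id” in one object is rejected, while get_key(message, "ID") finds the value stored under “id”.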

Each definition line has the form KEY: TYPE and is followed by an explanation line. The type can be a basic JSON type (in italics), a syntactically restricted type (see the Types chapter), or an array of either of the two (order is significant). Types define the expected syntax; their content may be further restricted, syntactically or semantically, by the explanation of the particular key.

The keys Format, ID, DetectTime and Category are mandatory; all other keys are optional (a nonexistent key indicates that the information is not applicable or unknown).
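A check for the mandatory keys might look as follows. This is a sketch under assumptions: the function name missing_mandatory is hypothetical, and the ID, timestamp and category in the sample message are illustrative placeholders rather than values taken from the specification.

```python
MANDATORY_KEYS = ("Format", "ID", "DetectTime", "Category")

def missing_mandatory(message):
    """List mandatory keys absent from a parsed message (compared ignoring case)."""
    present = {key.casefold() for key in message}
    return [key for key in MANDATORY_KEYS if key.casefold() not in present]

# Illustrative values only; the ID and category are placeholders.
minimal = {
    "Format": "IDEA0",
    "ID": "4390fc3f-c753-4a3e-bc83-1b44f24baf75",
    "DetectTime": "2014-03-05T14:30:00Z",
    "Category": ["Recon.Scanning"],
}
```

The check covers key presence only; value syntax is described in the Types chapter.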

As human language may be ambiguous, inadvertently or by omission, consult the JSON schema when in doubt.

Definition

Format: Version
Identifier of the IDEA container.
ID: ID
Unique message identifier.
AltNames: array of string
Alternative identifiers; strings which help to pair the event to internal system information (for example tickets in request tracker systems).
CorrelID: array of ID
Identifiers of messages that served as information sources for this message, in case it was created by correlation, analysis or deduction from other messages.
AggrID: array of ID
Identifiers of messages that this message aggregates into a more concise form. Should be sent mostly by intermediary nodes, which detect duplicates or merge events spanning multiple detection windows into one longer event.
PredID: array of ID
Identifiers of messages that are obsoleted by this message, which replaces the information in them. Should be sent only by detection nodes to incorporate further data about an ongoing event.
RelID: array of ID
Otherwise related messages.
CreateTime: Timestamp
Timestamp of the creation of the IDEA message. May point out delay between detection and processing of data.
DetectTime: Timestamp
Timestamp of the moment of detection of event (not necessarily time of the event taking place). This timestamp is mandatory, because every detector is able to know when it detected the information - for example when line about event appeared in the logfile, or when its information source says the event was detected, or at least when it accepted the information from the source.
EventTime: Timestamp
Deduced start of the event/attack, or just the time of the event if it is solitary.
CeaseTime: Timestamp
Deduced end of the event/attack.
WinStartTime: Timestamp
Beginning of aggregation window in which event has been observed.
WinEndTime: Timestamp
End of aggregation window in which event has been observed.
ConnCount: Integer
Number of individual connections that were attempted or took place.
FlowCount: Integer
Number of individual simplex (one direction) flows.
PacketCount: Integer
Number of individual packets transferred.
ByteCount: Integer
Number of bytes transferred.
Category: array of EventTag
Category of event.
Ref: array of URI
References to known sources, related to attack and/or vulnerability. May be URL of the additional info, or URN (according to RFC 2141) in registered namespace (IANA) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as “urn:cve:CVE-2013-5634”.
Confidence: number
Confidence of the detector in the reliability of this particular detection (0 – surely false, 1 – no doubts). If the key is not present, the detector does not know (or has no capability to estimate) the confidence.
Description: string
Short free text human readable description.
Note: string
Free text human readable additional note, possibly a longer description of the incident if not obvious.
Source: array of object
Information concerning particular source or target.
- Type: array of SourceTargetTag
  Closer category of source/target.
- Hostname: array of string
  Hostname of this source/target. Should be FQDN, but may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed. Empty array can be used to explicitly indicate that value has been inquired and not found (missing DNS name).
- IP4: array of Net4
  IPv4 addresses of this source/target.
- MAC: array of MAC
  MAC addresses of this source/target.
- IP6: array of Net6
  IPv6 addresses of this source/target.
- Port: array of Integer
  Source or destination ports affected.
- Proto: array of ProtocolName
  Protocols, concerning connections from/to this source/target.
- URL: array of string
  Uniform Resource Locator of this source/target. Should be formatted according to RFC 1738, RFC 1808 and related, however may not conform exactly, because values extracted from logs, messages, etc. may themselves be malformed.
- Email: array of string
  Email address (for example Reply-To address in phishing message). Should be formatted according to RFC 5322, section 3.4 and related, however may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed.
- AttachHand: array of Handle
  Identifiers of attachments related to this source/target - contain “Handle”s of related attachments.
- Note: string
  Free text human readable additional note.
- Spoofed: Boolean
  Establishes whether this source/target is forged.
- Imprecise: Boolean
  Establishes whether this source/target is knowingly imprecise.
- Anonymised: Boolean
  Establishes whether this source/target is willingly incomplete.
- ASN: array of Integer
  Autonomous system number of this source/target.
- Router: array of string
  Router/interface path information. Intentionally organisation specific, router identifiers have usually no clear meaning outside organisational unit.
- Netname: array of Netname
  RIR database reference network identifier (for example “ripe:CESNET-BB2” or “arin:WETEMAA”). Common network identifiers are: ripe, arin, apnic, lacnic, afrinic. Empty array can be used to explicitly indicate that value has been inquired and not found (IP address from unassigned block).
- Ref: array of URI
  References to known sources, related to attack and/or vulnerability, specific to this source/target. May be URL of the additional info, or URN (according to RFC 2141) in registered namespace (IANA) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as “urn:cve:CVE-2013-2266”.
Target: array of object
Information concerning particular source or target.
- Type: array of SourceTargetTag
  Closer category of source/target.
- Hostname: array of string
  Hostname of this source/target. Should be FQDN, but may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed. Empty array can be used to explicitly indicate that value has been inquired and not found (missing DNS name).
- IP4: array of Net4
  IPv4 addresses of this source/target.
- MAC: array of MAC
  MAC addresses of this source/target.
- IP6: array of Net6
  IPv6 addresses of this source/target.
- Port: array of Integer
  Source or destination ports affected.
- Proto: array of ProtocolName
  Protocols, concerning connections from/to this source/target.
- URL: array of string
  Uniform Resource Locator of this source/target. Should be formatted according to RFC 1738, RFC 1808 and related, however may not conform exactly, because values extracted from logs, messages, etc. may themselves be malformed.
- Email: array of string
  Email address (for example Reply-To address in phishing message). Should be formatted according to RFC 5322, section 3.4 and related, however may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed.
- AttachHand: array of Handle
  Identifiers of attachments related to this source/target - contain “Handle”s of related attachments.
- Note: string
  Free text human readable additional note.
- Spoofed: Boolean
  Establishes whether this source/target is forged.
- Imprecise: Boolean
  Establishes whether this source/target is knowingly imprecise.
- Anonymised: Boolean
  Establishes whether this source/target is willingly incomplete.
- ASN: array of Integer
  Autonomous system number of this source/target.
- Router: array of string
  Router/interface path information. Intentionally organisation specific, router identifiers have usually no clear meaning outside organisational unit.
- Netname: array of Netname
  RIR database reference network identifier (for example “ripe:CESNET-BB2” or “arin:WETEMAA”). Common network identifiers are: ripe, arin, apnic, lacnic, afrinic. Empty array can be used to explicitly indicate that value has been inquired and not found (IP address from unassigned block).
- Ref: array of URI
  References to known sources, related to attack and/or vulnerability, specific to this source/target. May be URL of the additional info, or URN (according to RFC 2141) in registered namespace (IANA) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as “urn:cve:CVE-2013-2266”.
Attach: array of object
Additional attachment information and data.
- Handle: Handle
  Message unique identifier for reference through Attach elements.
- FileName: array of string
  Names of the attached file.
- Type: array of AttachmentTag
  Type of the attached data.
- Hash: array of Hash
  Checksum of the content (for example “sha1:794467071687f7c59d033f4de5ece6b46415b633” or “md5:dc89f0b4ff9bd3b061dd66bb66c991b1”).
- Size: Integer
  Length of the content.
- Ref: array of URI
  References to known sources, related to attack and/or vulnerability, specific to this attachment. May be URL of the additional info, or URN (according to RFC 2141) in registered namespace (IANA) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as “urn:clamav:Win.Trojan.Banker-14334”.
- Note: string
  Free text human readable additional note.
- ContentType: MediaType
  Internet Media Type of the attachment, according to RFC 2046 and related. Along with types standardized by IANA also non standard but widely used media types can be used (for examples see MIME types list at freeformatter.com).
- ContentCharset: Charset
  Name of the content character set according to IANA list. If key is not defined, unspecified binary encoding is assumed.
- ContentEncoding: Encoding
  Encoding of the content, if feasible. Nonexistent key means native JSON encoding.
- Content: string
  Attachment content.
- ContentID: array of string
  If the content of the attachment is transferred separately (in the underlying container), this key contains the external ID of the content, so that it can be paired back to the message.
- ExternalURI: array of URI
  If content of attachment is available and/or recognizable from external source, this is defining URI (usually URL). May also be URN (according to RFC 2141) in registered namespace (IANA) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as “urn:mhr:55eaf7effadc07f866d1eaed9c64e7ee49fe081a”, “magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C”.
Node: array of object
Detector or possible intermediary (event aggregator, correlator, etc.) description.
- Name: NSID
  Name of the detector, which must be reasonably unique yet still meaningful. It usually denotes the hierarchy of organisational units to which the detector belongs, followed by its own name.
- Type: array of NodeTag
  Tag, describing various facets of the detector.
- SW: array of string
  The name of the detection software (optionally including version). For example “labrea-2.5-stable-1” or “HP TippingPoint 7500NX”.
- AggrWin: Duration
  The size of the aggregation window, if applicable.
- Note: string
  Free text human readable additional description.
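The AttachHand key of a Source or Target entry refers to Handle values of Attach entries within the same message. A consistency check could be sketched as follows; the function name dangling_attach_handles is hypothetical, and the sample message contains placeholder values only.

```python
def dangling_attach_handles(message):
    """Return AttachHand values that match no Handle among the Attach entries."""
    defined = {entry["Handle"] for entry in message.get("Attach", [])
               if "Handle" in entry}
    used = []
    for section in ("Source", "Target"):
        for entry in message.get(section, []):
            used.extend(entry.get("AttachHand", []))
    return [handle for handle in used if handle not in defined]

# Placeholder message fragment: one attachment, referenced once correctly
# and once through a handle that does not exist.
sample = {
    "Source": [{"IP4": ["192.0.2.1"], "AttachHand": ["att1"]}],
    "Target": [{"IP4": ["192.0.2.2"], "AttachHand": ["att2"]}],
    "Attach": [{"Handle": "att1", "Note": "placeholder attachment"}],
}
```

For the sample above the function reports “att2”, because no Attach entry defines that handle.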

Types

Boolean

JSON “true” or “false” value.

Integer

JSON “number” with no fractional and exponential part.
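When a message is parsed with Python's standard json module, a number without fractional or exponential part becomes an int and any other number becomes a float, so the Integer type can be tested as sketched below. The function name is_idea_integer is hypothetical. Booleans are excluded explicitly because bool is a subclass of int in Python.

```python
import json

def is_idea_integer(value):
    """True for values parsed from a JSON number without fraction or exponent."""
    return isinstance(value, int) and not isinstance(value, bool)
```

For example, json.loads("42") passes the check, whereas json.loads("42.0") and json.loads("1e3") both yield floats and fail it.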

Version

Must contain string “IDEA0”. (Trailing zero denotes draft version, after review/discussion and specification finalisation the name will change.)

MediaType

Internet media type without parameters. Format is type and subtype, separated by slash, where type can contain only alphanumeric, underscore and minus sign, and subtype can contain only alphanumeric, plus and minus sign, underscore and dot.

Charset

Character set name may consist of alphanumeric, dot, colon, minus sign, underscore and parentheses (round brackets).

Encoding

May contain only string “base64” (however note that key can be nonexistent, which means native encoding).
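Binary attachment content cannot be carried in a JSON string directly, which is where the “base64” encoding applies. The sketch below shows one possible way of producing and reading the Content key; the function names encode_attachment and decode_content are hypothetical, and returning the string unchanged for the native encoding is an assumption of this example.

```python
import base64

def encode_attachment(handle, data):
    """Build an Attach entry carrying raw bytes as base64 text."""
    return {
        "Handle": handle,
        "ContentEncoding": "base64",
        "Content": base64.b64encode(data).decode("ascii"),
    }

def decode_content(attachment):
    """Return bytes for base64 content, or the string itself when the key is absent."""
    encoding = attachment.get("ContentEncoding")
    if encoding is None:
        # Nonexistent key: content is a native JSON string.
        return attachment["Content"]
    if encoding == "base64":
        return base64.b64decode(attachment["Content"])
    raise ValueError(f"unsupported ContentEncoding: {encoding!r}")
```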

Handle

String value unique among all “Handle” element values. May contain only alphanumeric or underscore, must not start with number and must not be empty.

ID

String, containing a reasonably globally unique identifier. UUID version 4 (random) or 5 (SHA-1) is recommended. As IDs are meant to be used in other media, transfer protocols and formats (for example as query string fields in a URL), they may contain only a reasonably safe subset of characters: alphanumeric characters, dot, minus sign and underscore. An ID must not be empty.
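Generating and checking an ID could be sketched as follows. The names ID_RE, is_valid_id and new_id are hypothetical, and “alphanumeric” is assumed to mean ASCII letters and digits. The textual form of a version 4 UUID consists of hexadecimal digits and minus signs, so it satisfies the character restriction.

```python
import re
import uuid

# Alphanumeric (assumed ASCII), dot, minus sign and underscore; at least one character.
ID_RE = re.compile(r"[A-Za-z0-9._-]+")

def is_valid_id(text):
    """Check the character restrictions of the ID type."""
    return ID_RE.fullmatch(text) is not None

def new_id():
    """Generate a random (version 4) UUID in its textual form."""
    return str(uuid.uuid4())
```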

Timestamp

String, containing timestamp conforming to RFC 3339.
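A sketch of producing and reading such timestamps with Python's datetime module follows. The function names are hypothetical. The formatter emits whole seconds in UTC only, and the parser covers a common subset of RFC 3339 rather than the full grammar; the trailing “Z” is rewritten to a numeric offset because datetime.fromisoformat does not accept it in Python versions before 3.11.

```python
from datetime import datetime, timezone

def format_timestamp(moment):
    """Render an aware datetime as an RFC 3339 string in UTC (whole seconds)."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime carries no UTC offset")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def parse_timestamp(text):
    """Parse a timestamp with 'Z' or a numeric offset into an aware datetime."""
    if text[-1:] in ("Z", "z"):
        text = text[:-1] + "+00:00"
    moment = datetime.fromisoformat(text)
    if moment.tzinfo is None:
        raise ValueError("timestamp lacks a UTC offset")
    return moment
```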

Duration

String, containing a time offset, intended for representing the difference between two timestamps. The format is the time part of RFC 3339, optionally preceded by a number of days (with an arbitrary number of digits) and a “D” or “d” separator. The “D” separator has been chosen to distinguish the value from internet time, and as a memory aid for “duration” or “days”. For example “536D10:20:30.5” means 536 days, 10 hours, 20 minutes and 30.5 seconds, whereas “00:05:00” represents five minutes.

ABNF syntax:

time-hour       = 2DIGIT  ; 00-23
time-minute     = 2DIGIT  ; 00-59
time-second     = 2DIGIT  ; 00-59
time-secfrac    = "." 1*DIGIT
separator       = "D" / "d"
days            = 1*DIGIT

duration        = [days separator] time-hour ":" time-minute ":" time-second [time-secfrac]
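
The ABNF above translates fairly directly into a regular expression. The following parser is a sketch only: the names DURATION_RE and parse_duration are hypothetical, and the range checks follow the comments in the grammar (hours 00-23, minutes and seconds 00-59).

```python
import re
from datetime import timedelta

# [days separator] time-hour ":" time-minute ":" time-second [time-secfrac]
DURATION_RE = re.compile(
    r"(?:([0-9]+)[Dd])?([0-9]{2}):([0-9]{2}):([0-9]{2})(\.[0-9]+)?"
)

def parse_duration(text):
    """Convert a Duration string into a datetime.timedelta."""
    match = DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Duration: {text!r}")
    days, hours, minutes, seconds, fraction = match.groups()
    hours, minutes, seconds = int(hours), int(minutes), int(seconds)
    if hours > 23 or minutes > 59 or seconds > 59:
        raise ValueError(f"time part out of range: {text!r}")
    return timedelta(
        days=int(days or 0),
        hours=hours,
        minutes=minutes,
        seconds=seconds + float(fraction or 0),
    )
```

For the examples given in the text, “536D10:20:30.5” yields 536 days, 10 hours, 20 minutes and 30.5 seconds, and “00:05:00” yields five minutes.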

URI

String, containing URI as defined in RFC 3986 and related.

Net4

String, containing IPv4 range in human readable form. Range can be specified as CIDR network (“192.0.2.0/24”) or two IP addresses in dot-decimal notation, separated by minus sign (“192.0.2.0-192.0.2.255”).

Net6

String, containing IPv6 range in human readable form. Range can be specified as CIDR notation (“2001:db8::/48”) or two IP addresses in colon-hexadecimal notation, separated by minus sign (“2001:db8::-2001:db8:0:ffff:ffff:ffff:ffff:ffff”).
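Both notations of Net4 and Net6 can be reduced to a pair of first and last addresses with Python's ipaddress module, as sketched below. The function name parse_net is hypothetical. Note that ipaddress.ip_network rejects a CIDR value with host bits set by default, and that it also accepts a single address without a prefix length; whether either behaviour is desired is an assumption left to the implementer.

```python
import ipaddress

def parse_net(text):
    """Return (first, last) addresses for a CIDR or 'first-last' range string."""
    if "-" in text:
        first_text, last_text = text.split("-", 1)
        first = ipaddress.ip_address(first_text)
        last = ipaddress.ip_address(last_text)
        # Both ends must belong to the same address family and be ordered.
        if first.version != last.version or first > last:
            raise ValueError(f"invalid range: {text!r}")
        return first, last
    network = ipaddress.ip_network(text)
    return network[0], network[-1]
```

With the examples from the text, “192.0.2.0/24” and “192.0.2.0-192.0.2.255” describe the same range, as do “2001:db8::/48” and “2001:db8::-2001:db8:0:ffff:ffff:ffff:ffff:ffff”.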

NSID

Namespaced identifier. Dot separated list of labels, with significance from left to right – leftmost denoting largest containing realm, rightmost denoting single entity. Country – organisation – suborganizations – machine – local scheme akin to “org.example.csirt.northwest.honeypot.jabberwock” is strongly recommended. Label case is insignificant, label can contain only letters, numbers or underscore and must not start with number.
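The label rules can be checked as sketched below. The names LABEL_RE and parse_nsid are hypothetical, “letters” is assumed to mean ASCII letters, and lowercasing the labels is merely one way to honour the case insensitivity when comparing names.

```python
import re

# A label: letters, digits or underscore, not starting with a digit.
LABEL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def parse_nsid(text):
    """Split an NSID into lowercased labels, largest realm first."""
    labels = text.split(".")
    for label in labels:
        if LABEL_RE.fullmatch(label) is None:
            raise ValueError(f"invalid NSID label {label!r} in {text!r}")
    return [label.lower() for label in labels]
```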

MAC

String, containing MAC address in human friendly form - six groups of two hexadecimal digits, separated by colon.

Netname

URI string, containing RIR (Regional Internet Registry) identifier and network identifier within the RIR namespace, separated by colon.

Hash

URI string, defining hash type and hash value, separated by colon.
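A Hash value can be produced with Python's hashlib module, as in the following sketch. The function names make_hash and split_hash are hypothetical, and using the hashlib algorithm name as the hash type prefix is an assumption that happens to match the “sha1” and “md5” examples given for the Hash key.

```python
import hashlib

def make_hash(data, algorithm="sha1"):
    """Return 'type:hexdigest' for the given bytes."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return f"{algorithm}:{digest}"

def split_hash(text):
    """Split a Hash string into (type, value) at the first colon."""
    algorithm, _, value = text.partition(":")
    if not algorithm or not value:
        raise ValueError(f"not a Hash: {text!r}")
    return algorithm, value
```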

EventTag

Category name consists of one or two abbreviated parts - category and optional subcategory, separated by dot. If unsure of more precise nature of the incident, subcategory and dot may be omitted. Category and subcategory name must contain only alphanumeric, underscore and minus sign.
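Splitting a tag into category and optional subcategory could look as follows. This is a sketch: the names EVENT_TAG_RE and split_event_tag are hypothetical, “alphanumeric” is assumed to be ASCII, and the tag used in the usage note is illustrative only; the check is purely syntactic and does not consult the taxonomy.

```python
import re

# category ["." subcategory], each of alphanumeric, underscore and minus sign.
EVENT_TAG_RE = re.compile(r"([A-Za-z0-9_-]+)(?:\.([A-Za-z0-9_-]+))?")

def split_event_tag(tag):
    """Return (category, subcategory); subcategory is None when omitted."""
    match = EVENT_TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not an EventTag: {tag!r}")
    return match.group(1), match.group(2)
```

For instance, split_event_tag("Recon.Scanning") gives the pair of “Recon” and “Scanning”, while a bare “Recon” gives “Recon” and None.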

For semantics and taxonomy see security event types classification.

ProtocolName

Name must not be empty, must contain only alphanumeric and minus sign, must contain at least one letter, must not begin or end with a hyphen and two hyphens must not be adjacent.
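The combined constraints on protocol names can be expressed as one pattern plus a letter check, as sketched below. The names PROTOCOL_RE and is_valid_protocol_name are hypothetical, and “alphanumeric” is assumed to be ASCII.

```python
import re

# Alphanumeric groups joined by single hyphens: this rules out empty names,
# leading or trailing hyphens, and adjacent hyphens in one pattern.
PROTOCOL_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")

def is_valid_protocol_name(name):
    """Check the syntactic rules of the ProtocolName type."""
    if PROTOCOL_RE.fullmatch(name) is None:
        return False
    # At least one letter is required, so purely numeric names are rejected.
    return any(ch.isalpha() for ch in name)
```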

For semantics and applicable strings see protocols classification.

SourceTargetTag

Tag name must contain only alphanumeric, underscore and minus sign.

For semantics and taxonomy see source/target classification.

NodeTag

Tag name must contain only alphanumeric, underscore and minus sign.

For semantics and taxonomy see classification of detection nodes.

AttachmentTag

Tag name must contain only alphanumeric, underscore and minus sign.

For semantics and taxonomy see attachment description.