====== IDEA0 format definition ======

Keys use CamelCase; however, to avoid confusion, they must be case-insensitively unique within their parent object. When parsing, keys "ID", "id", "iD" and "Id" must be considered equivalent.

Each definition line has the form KEY: TYPE, followed by an explanation line, where the type can be a basic JSON type (in ''italics''), a syntactically restricted type (with a reference to the [[#Types|Types]] chapter), or an array of the former two (order is important). Types define the expected syntax; however, their content may be further syntactically or semantically restricted according to the particular key explanation.

The keys ''Format'', ''ID'', ''DetectTime'' and ''Category'' are mandatory; the rest of the keys are optional (a nonexistent key indicates that the information is not applicable or unknown).

As human language may be ambiguous inadvertently or by omission, when in doubt, consult the [[en/schema|JSON schema]].

===== Definition =====

  * **Format**: [[#Version|Version]]\\ Identifier of the IDEA container.\\
  * **ID**: [[#ID|ID]]\\ Unique message identifier.\\
  * **AltNames**: array of ''string''\\ Alternative identifiers; strings which help to pair the event to internal system information (for example tickets in request tracker systems).\\
  * **CorrelID**: array of [[#ID|ID]]\\ Identifiers of messages which are information sources for the creation of this message, in case the message has been created based on correlation/analysis/deduction of other messages.\\
  * **AggrID**: array of [[#ID|ID]]\\ Identifiers of messages which are aggregated into a more concise form by this message. Should be sent mostly by intermediary nodes, which detect duplicates, or aggregate events spanning multiple detection windows into one longer event.\\
  * **PredID**: array of [[#ID|ID]]\\ Identifiers of messages which are obsoleted and whose information is replaced by this message. Should be sent only by detection nodes to incorporate further data about an ongoing event.\\
  * **RelID**: array of [[#ID|ID]]\\ Otherwise related messages.\\
  * **CreateTime**: [[#Timestamp|Timestamp]]\\ Timestamp of the creation of the IDEA message. May point out a delay between detection and processing of data.\\
  * **DetectTime**: [[#Timestamp|Timestamp]]\\ Timestamp of the moment of detection of the event (not necessarily the time of the event taking place). This timestamp is mandatory, because every detector is able to know when it detected the information - for example when a line about the event appeared in the logfile, or when its information source says the event was detected, or at least when it accepted the information from the source.\\
  * **EventTime**: [[#Timestamp|Timestamp]]\\ Deduced start of the event/attack, or just the time of the event if it is solitary.\\
  * **CeaseTime**: [[#Timestamp|Timestamp]]\\ Deduced end of the event/attack.\\
  * **WinStartTime**: [[#Timestamp|Timestamp]]\\ Beginning of the aggregation window in which the event has been observed.\\
  * **WinEndTime**: [[#Timestamp|Timestamp]]\\ End of the aggregation window in which the event has been observed.\\
  * **ConnCount**: [[#Integer|Integer]]\\ Number of individual connections attempted or taken place.\\
  * **FlowCount**: [[#Integer|Integer]]\\ Number of individual simplex (one direction) flows.\\
  * **PacketCount**: [[#Integer|Integer]]\\ Number of individual packets transferred.\\
  * **ByteCount**: [[#Integer|Integer]]\\ Number of bytes transferred.\\
  * **Category**: array of [[#EventTag|EventTag]]\\ Category of event.\\
  * **Ref**: array of [[#URI|URI]]\\ References to known sources, related to attack and/or vulnerability. May be URL of the additional info, or URN (according to [[http://tools.ietf.org/html/rfc2141|RFC 2141]]) in registered namespace ([[http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml|IANA]]) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as "urn:cve:CVE-2013-5634".\\
  * **Confidence**: ''number''\\ Confidence of the detector in its own reliability of this particular detection (0 – surely false, 1 – no doubts). If the key is not present, the detector does not know (or has no capability to estimate the confidence).\\
  * **Description**: ''string''\\ Short free text human readable description.\\
  * **Note**: ''string''\\ Free text human readable additional note, possibly a longer description of the incident if not obvious.\\
  * **Source**: array of ''object''\\ Information concerning particular source or target.\\
    * **Type**: array of [[#SourceTargetTag|SourceTargetTag]]\\ Closer category of source/target.\\
    * **Hostname**: array of ''string''\\ Hostname of this source/target. Should be FQDN, but may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed. Empty array can be used to explicitly indicate that value has been inquired and not found (missing DNS name).\\
    * **IP4**: array of [[#Net4|Net4]]\\ IPv4 addresses of this source/target.\\
    * **MAC**: array of [[#MAC|MAC]]\\ MAC addresses of this source/target.\\
    * **IP6**: array of [[#Net6|Net6]]\\ IPv6 addresses of this source/target.\\
    * **Port**: array of [[#Integer|Integer]]\\ Source or destination ports affected.\\
    * **Proto**: array of [[#ProtocolName|ProtocolName]]\\ Protocols, concerning connections from/to this source/target.\\
    * **URL**: array of ''string''\\ Uniform Resource Locator of this source/target. Should be formatted according to [[http://tools.ietf.org/html/rfc1738|RFC 1738]], [[http://tools.ietf.org/html/rfc1808|RFC 1808]] and related, however may not conform exactly, because values, extracted from logs, messages, etc. may themselves be malformed.\\
    * **Email**: array of ''string''\\ Email address (for example Reply-To address in phishing message). Should be formatted according to [[http://tools.ietf.org/html/rfc5322#section-3.4|RFC 5322, section 3.4]] and related, however may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed.\\
    * **AttachHand**: array of [[#Handle|Handle]]\\ Identifiers of attachments related to this source/target - contain "Handle"s of related attachments.\\
    * **Note**: ''string''\\ Free text human readable additional note.\\
    * **Spoofed**: [[#Boolean|Boolean]]\\ Establishes whether this source/target is forged.\\
    * **Imprecise**: [[#Boolean|Boolean]]\\ Establishes whether this source/target is knowingly imprecise.\\
    * **Anonymised**: [[#Boolean|Boolean]]\\ Establishes whether this source/target is willingly incomplete.\\
    * **ASN**: array of [[#Integer|Integer]]\\ Autonomous system number of this source/target.\\
    * **Router**: array of ''string''\\ Router/interface path information. Intentionally organisation specific, router identifiers have usually no clear meaning outside organisational unit.\\
    * **Netname**: array of [[#Netname|Netname]]\\ RIR database reference network identifier (for example "ripe:CESNET-BB2" or "arin:WETEMAA"). Common network identifiers are: ripe, arin, apnic, lacnic, afrinic. Empty array can be used to explicitly indicate that value has been inquired and not found (IP address from unassigned block).\\
    * **Ref**: array of [[#URI|URI]]\\ References to known sources, related to attack and/or vulnerability, specific to this source/target. May be URL of the additional info, or URN (according to [[http://tools.ietf.org/html/rfc2141|RFC 2141]]) in registered namespace ([[http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml|IANA]]) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as "urn:cve:CVE-2013-2266".\\
  * **Target**: array of ''object''\\ Information concerning particular source or target.\\
    * **Type**: array of [[#SourceTargetTag|SourceTargetTag]]\\ Closer category of source/target.\\
    * **Hostname**: array of ''string''\\ Hostname of this source/target. Should be FQDN, but may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed. Empty array can be used to explicitly indicate that value has been inquired and not found (missing DNS name).\\
    * **IP4**: array of [[#Net4|Net4]]\\ IPv4 addresses of this source/target.\\
    * **MAC**: array of [[#MAC|MAC]]\\ MAC addresses of this source/target.\\
    * **IP6**: array of [[#Net6|Net6]]\\ IPv6 addresses of this source/target.\\
    * **Port**: array of [[#Integer|Integer]]\\ Source or destination ports affected.\\
    * **Proto**: array of [[#ProtocolName|ProtocolName]]\\ Protocols, concerning connections from/to this source/target.\\
    * **URL**: array of ''string''\\ Uniform Resource Locator of this source/target. Should be formatted according to [[http://tools.ietf.org/html/rfc1738|RFC 1738]], [[http://tools.ietf.org/html/rfc1808|RFC 1808]] and related, however may not conform exactly, because values, extracted from logs, messages, etc. may themselves be malformed.\\
    * **Email**: array of ''string''\\ Email address (for example Reply-To address in phishing message). Should be formatted according to [[http://tools.ietf.org/html/rfc5322#section-3.4|RFC 5322, section 3.4]] and related, however may not conform exactly, because values, extracted from logs, messages, DNS, etc. may themselves be malformed.\\
    * **AttachHand**: array of [[#Handle|Handle]]\\ Identifiers of attachments related to this source/target - contain "Handle"s of related attachments.\\
    * **Note**: ''string''\\ Free text human readable additional note.\\
    * **Spoofed**: [[#Boolean|Boolean]]\\ Establishes whether this source/target is forged.\\
    * **Imprecise**: [[#Boolean|Boolean]]\\ Establishes whether this source/target is knowingly imprecise.\\
    * **Anonymised**: [[#Boolean|Boolean]]\\ Establishes whether this source/target is willingly incomplete.\\
    * **ASN**: array of [[#Integer|Integer]]\\ Autonomous system number of this source/target.\\
    * **Router**: array of ''string''\\ Router/interface path information. Intentionally organisation specific, router identifiers have usually no clear meaning outside organisational unit.\\
    * **Netname**: array of [[#Netname|Netname]]\\ RIR database reference network identifier (for example "ripe:CESNET-BB2" or "arin:WETEMAA"). Common network identifiers are: ripe, arin, apnic, lacnic, afrinic. Empty array can be used to explicitly indicate that value has been inquired and not found (IP address from unassigned block).\\
    * **Ref**: array of [[#URI|URI]]\\ References to known sources, related to attack and/or vulnerability, specific to this source/target. May be URL of the additional info, or URN (according to [[http://tools.ietf.org/html/rfc2141|RFC 2141]]) in registered namespace ([[http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml|IANA]]) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as "urn:cve:CVE-2013-2266".\\
  * **Attach**: array of ''object''\\ Additional attachment information and data.\\
    * **Handle**: [[#Handle|Handle]]\\ Message unique identifier for reference through Attach elements.\\
    * **FileName**: array of ''string''\\ Names of the attached file.\\
    * **Type**: array of [[#AttachmentTag|AttachmentTag]]\\ Type of the attached data.\\
    * **Hash**: array of [[#Hash|Hash]]\\ Checksum of the content (for example "sha1:794467071687f7c59d033f4de5ece6b46415b633" or "md5:dc89f0b4ff9bd3b061dd66bb66c991b1").\\
    * **Size**: [[#Integer|Integer]]\\ Length of the content.\\
    * **Ref**: array of [[#URI|URI]]\\ References to known sources, related to attack and/or vulnerability, specific to this attachment. May be URL of the additional info, or URN (according to [[http://tools.ietf.org/html/rfc2141|RFC 2141]]) in registered namespace ([[http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml|IANA]]) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as "urn:clamav:Win.Trojan.Banker-14334".\\
    * **Note**: ''string''\\ Free text human readable additional note.\\
    * **ContentType**: [[#MediaType|MediaType]]\\ Internet Media Type of the attachment, according to [[http://tools.ietf.org/html/rfc2046|RFC 2046]] and related. Along with [[http://www.iana.org/assignments/media-types/media-types.xhtml|types standardized by IANA]], non-standard but widely used media types can also be used (for examples see [[http://www.freeformatter.com/mime-types-list.html|MIME types list at freeformatter.com]]).\\
    * **ContentCharset**: [[#Charset|Charset]]\\ Name of the content character set according to [[http://www.iana.org/assignments/character-sets/character-sets.xhtml|IANA list]]. If key is not defined, unspecified binary encoding is assumed.\\
    * **ContentEncoding**: [[#Encoding|Encoding]]\\ Encoding of the content, if feasible. Nonexistent key means native JSON encoding.\\
    * **Content**: ''string''\\ Attachment content.\\
    * **ContentID**: array of ''string''\\ If content of attachment is transferred separately (in underlying container), this key contains external ID of the content, so it can be paired back to message.\\
    * **ExternalURI**: array of [[#URI|URI]]\\ If content of attachment is available and/or recognizable from external source, this is defining URI (usually URL). May also be URN (according to [[http://tools.ietf.org/html/rfc2141|RFC 2141]]) in registered namespace ([[http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml|IANA]]) or unregistered ad-hoc namespace bearing reasonable information value and uniqueness, such as "urn:mhr:55eaf7effadc07f866d1eaed9c64e7ee49fe081a", "magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C".\\
  * **Node**: array of ''object''\\ Detector or possible intermediary (event aggregator, correlator, etc.) description.\\
    * **Name**: [[#NSID|NSID]]\\ Name of the detector, which must be reasonably unique, however still bear some meaningful sense. Usually denotes hierarchy of organisational units which detector belongs to and its own name.\\
    * **Type**: array of [[#NodeTag|NodeTag]]\\ Tag, describing various facets of the detector.\\
    * **SW**: array of ''string''\\ The name of the detection software (optionally including version). For example "labrea-2.5-stable-1" or "HP TippingPoint 7500NX".\\
    * **AggrWin**: [[#Duration|Duration]]\\ The size of the aggregation window, if applicable.\\
    * **Note**: ''string''\\ Free text human readable additional description.\\

===== Types =====

==== Boolean ====

JSON "true" or "false" value.

==== Integer ====

JSON "number" with no fractional and exponential part.

==== Version ====

Must contain string "IDEA0". (The trailing zero denotes a draft version; after review/discussion and specification finalisation the name will change.)

==== MediaType ====

Internet media type without parameters. Format is type and subtype, separated by slash, where type can contain only alphanumeric, underscore and minus sign, and subtype can contain only alphanumeric, plus and minus sign, underscore and dot.

==== Charset ====

Character set name may consist of alphanumeric, dot, colon, minus sign, underscore and parentheses (round brackets).

==== Encoding ====

May contain only string "base64" (however note that key can be nonexistent, which means native encoding).

==== Handle ====

String value unique among all "Handle" element values. May contain only alphanumeric or underscore, must not start with number and must not be empty.

==== ID ====

String, containing reasonably globally unique identifier. UUID version 4 (random) or 5 (SHA-1) is recommended. As IDs are meant to be used in other media, transfer protocols and formats (an example being query string fields in URL), they are allowed to contain only a reasonably safe subset of characters. May thus contain only alphanumeric, dot, minus sign and underscore and must not be empty.

==== Timestamp ====

String, containing timestamp conforming to [[http://tools.ietf.org/html/rfc3339|RFC 3339]].

==== Duration ====

String, containing time offset, intended for representing difference between two timestamps.
Format is the time part of [[http://tools.ietf.org/html/rfc3339|RFC 3339]], optionally preceded by a number of days (which can have an arbitrary number of digits) and a "D" or "d" separator. The "D" separator has been chosen to distinguish the value from internet time, and as a memory aid for "duration" or "days". For example "536D10:20:30.5" means 536 days, 10 hours, 20 minutes, 30.5 seconds, whereas "00:05:00" represents five minutes. [[http://tools.ietf.org/html/rfc2234|ABNF]] syntax:

  time-hour    = 2DIGIT ; 00-23
  time-minute  = 2DIGIT ; 00-59
  time-second  = 2DIGIT ; 00-59
  time-secfrac = "." 1*DIGIT
  separator    = "D" / "d"
  days         = 1*DIGIT
  duration     = [days separator] time-hour ":" time-minute ":" time-second [time-secfrac]

==== URI ====

String, containing URI as defined in [[http://tools.ietf.org/html/rfc3986|RFC 3986]] and related.

==== Net4 ====

String, containing IPv4 range in human readable form. Range can be specified as CIDR network ("192.0.2.0/24") or two IP addresses in dot-decimal notation, separated by minus sign ("192.0.2.0-192.0.2.255").

==== Net6 ====

String, containing IPv6 range in human readable form. Range can be specified as CIDR notation ("2001:db8::/48") or two IP addresses in colon-hexadecimal notation, separated by minus sign ("2001:db8::-2001:db8:0:ffff:ffff:ffff:ffff:ffff").

==== NSID ====

Namespaced identifier. Dot separated list of labels, with significance from left to right – leftmost denoting largest containing realm, rightmost denoting single entity. Country – organisation – suborganizations – machine – local scheme akin to "org.example.csirt.northwest.honeypot.jabberwock" is strongly recommended. Label case is insignificant, label can contain only letters, numbers or underscore and must not start with number.

==== MAC ====

String, containing MAC address in human friendly form - six groups of two hexadecimal digits, separated by colon.

==== Netname ====

URI string, containing LIR identifier and network identifier within LIR namespace, separated by colon.
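
As an illustration of the [[#Duration|Duration]] syntax defined earlier in this chapter, the following Python sketch turns a Duration string into a ''datetime.timedelta''. The function name ''parse_duration'' is hypothetical and not part of IDEA. The range checks are an assumption: they enforce the limits that appear only as comments in the ABNF (00-23, 00-59).

```python
import re
from datetime import timedelta

# Regular expression mirroring the Duration ABNF:
#   duration = [days separator] time-hour ":" time-minute ":" time-second [time-secfrac]
_DURATION_RE = re.compile(
    r"(?:(?P<days>[0-9]+)[Dd])?"
    r"(?P<hours>[0-9]{2}):(?P<minutes>[0-9]{2}):(?P<seconds>[0-9]{2})"
    r"(?P<frac>\.[0-9]+)?"
)


def parse_duration(text):
    """Parse an IDEA Duration string into a timedelta (hypothetical helper)."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError("not a valid Duration: %r" % (text,))
    hours = int(match["hours"])
    minutes = int(match["minutes"])
    seconds = int(match["seconds"])
    # Assumption: the ranges given in the ABNF comments are enforced.
    if hours > 23 or minutes > 59 or seconds > 59:
        raise ValueError("time field out of range: %r" % (text,))
    fraction = float(match["frac"]) if match["frac"] else 0.0
    return timedelta(
        days=int(match["days"] or 0),
        hours=hours,
        minutes=minutes,
        seconds=seconds + fraction,
    )
```

With this sketch, ''parse_duration("536D10:20:30.5")'' yields 536 days, 10 hours, 20 minutes and 30.5 seconds, matching the example in the Duration description.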
==== Hash ====

URI string, defining hash type and hash value, separated by colon.

==== EventTag ====

Category name consists of one or two abbreviated parts - category and optional subcategory, separated by dot. If unsure of more precise nature of the incident, subcategory and dot may be omitted. Category and subcategory name must contain only alphanumeric, underscore and minus sign. For semantics and taxonomy see [[en/classifications#eventtagsecurity_event_types_classification|security event types classification]].

==== ProtocolName ====

Name must not be empty, must contain only alphanumeric and minus sign, must contain at least one letter, must not begin or end with a hyphen and two hyphens must not be adjacent. For semantics and applicable strings see [[en/classifications#protocolnameprotocols_classification|protocols classification]].

==== SourceTargetTag ====

Tag name must contain only alphanumeric, underscore and minus sign. For semantics and taxonomy see [[en/classifications#sourcetargettagsourcetarget_classification|source/target classification]].

==== NodeTag ====

Tag name must contain only alphanumeric, underscore and minus sign. For semantics and taxonomy see [[en/classifications#nodetagclassification_of_detection_nodes|classification of detection nodes]].

==== AttachmentTag ====

Tag name must contain only alphanumeric, underscore and minus sign. For semantics and taxonomy see [[en/classifications#attachmenttagattachment_description|attachment description]].
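
To make the two structural rules from the introduction concrete (mandatory keys, and case-insensitive uniqueness of keys within their parent object), here is a minimal Python sketch. It is not a replacement for the JSON schema. The names ''load_idea'' and ''EXAMPLE'' are hypothetical, and the values in the sample message (the ID, the timestamp, the category "Recon.Scanning") are made up for illustration only.

```python
import json

# Keys the specification declares mandatory.
MANDATORY_KEYS = ("Format", "ID", "DetectTime", "Category")


def _unique_keys(pairs):
    """object_pairs_hook: reject keys that collide case-insensitively."""
    seen = {}
    for key, _value in pairs:
        folded = key.lower()
        if folded in seen:
            raise ValueError(
                "keys %r and %r are not case-insensitively unique"
                % (seen[folded], key)
            )
        seen[folded] = key
    return dict(pairs)


def load_idea(text):
    """Parse JSON text and apply the two structural checks (hypothetical helper)."""
    # The hook runs for every JSON object, so nested objects are checked too.
    message = json.loads(text, object_pairs_hook=_unique_keys)
    if not isinstance(message, dict):
        raise ValueError("an IDEA message must be a JSON object")
    present = {key.lower() for key in message}
    missing = [key for key in MANDATORY_KEYS if key.lower() not in present]
    if missing:
        raise ValueError("missing mandatory keys: " + ", ".join(missing))
    return message


# Illustrative message; all values are invented for this example.
EXAMPLE = """{
  "Format": "IDEA0",
  "ID": "4390fc3f-c753-4a3e-bc83-1b44f24baf75",
  "DetectTime": "2012-11-03T10:00:07Z",
  "Category": ["Recon.Scanning"],
  "Source": [{"IP4": ["192.0.2.0/24"], "Proto": ["tcp"]}]
}"""
```

Calling ''load_idea(EXAMPLE)'' returns the parsed message, while input such as ''{"ID": "a", "id": "b"}'' is rejected. Type-level checks (for example Timestamp or Net4 syntax) are deliberately left out of this sketch.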