aerolab

Supported Formats by GAIA AGI PreProcessor

Default

Raw Aerospike log format, as printed by the asd binary. Each node must be logged in a separate file, because the log lines do not carry node markers.

Typically raw logs can be retrieved using:

Tab-separated

Type 1

Multiple nodes from the same cluster can be concatenated into a single log file. Separate clusters should be in separate log files.

field 1             field 2    field 3
freeform (ignored)  node name  log line

Example:

sometext        node_a1        May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1
sometext        node_a2        May 19 2021 23:59:59 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1
sometext        node_a1        May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:814) {test} nsup-done: non-expirable 0 expired (4474593206,22157)
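As an illustration of the Type 1 layout, the following minimal sketch joins raw per-node log lines into tab-separated Type 1 lines. The function name `to_type1`, the `node_logs` mapping, and the `-` placeholder for the ignored first field are assumptions made for this example; they are not part of aerolab.

```python
def to_type1(node_logs, freeform="-"):
    """Yield Type 1 lines (freeform, node name, log line) from raw per-node logs.

    node_logs maps a node name to an iterable of raw Aerospike log lines.
    The freeform value is a placeholder; the first field is ignored.
    """
    for node, lines in node_logs.items():
        for line in lines:
            line = line.rstrip("\n")
            if line:  # skip blank lines
                yield "\t".join((freeform, node, line))
```

Each yielded string is one line of the combined file. Nodes from a different cluster would go into a separate file, as described above.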

Type 2

Multiple nodes and clusters can be concatenated into a single log file.

field 1             field 2    field 3       field 4
freeform (ignored)  node name  cluster name  log line

Example:

sometext        node_a1        prod_cluster        May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1
sometext        node_a2        prod_cluster        May 19 2021 23:59:59 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1
sometext        node_a1        prod_cluster        May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:814) {test} nsup-done: non-expirable 0 expired (4474593206,22157)
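To check that a Type 2 file is laid out as expected before processing, a small sketch like the one below can group its log lines by cluster and node. The helper `group_type2` is hypothetical and only mirrors the field order described above; it is not how the preprocessor itself is implemented.

```python
from collections import defaultdict

def group_type2(lines):
    """Group Type 2 lines by (cluster name, node name).

    Each line is: freeform <TAB> node name <TAB> cluster name <TAB> log line.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # maxsplit=3 keeps any tab inside the log line itself intact
        parts = line.split("\t", 3)
        if len(parts) != 4:
            raise ValueError(f"expected 4 tab-separated fields, got {len(parts)}")
        _freeform, node, cluster, log_line = parts
        groups[(cluster, node)].append(log_line)
    return dict(groups)
```

A line with fewer than four fields raises `ValueError`, which is a quick way to spot a Type 1 file passed in by mistake.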

JSON formats

GKE default logger

The supported JSON layout is shown below. Additional JSON fields may be present and are ignored. Use one JSON object per Aerospike log line. Logs from multiple nodes may be combined in the same JSON log file.

Do NOT wrap the lines in a JSON list ([]).

{"textPayload": "LOG_LINE","resource":{"labels":{"pod_name": "NODE_NAME"}}}
{"textPayload": "LOG_LINE","resource":{"labels":{"pod_name": "NODE_NAME_2"}}}

Example:

{"textPayload": "May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1","resource":{"labels":{"pod_name": "node_a1"}}}
{"textPayload": "May 19 2021 23:59:59 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1","resource":{"labels":{"pod_name": "node_a2"}}}
{"textPayload": "May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:814) {test} nsup-done: non-expirable 0 expired (4474593206,22157)","resource":{"labels":{"pod_name": "node_a1"}}}
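The field mapping above can be sketched in a few lines of Python. The function `parse_gke_line` is a hypothetical helper for this example; it only shows which keys carry the node name and the log line, and that any other keys are ignored.

```python
import json

def parse_gke_line(raw):
    """Return (node name, log line) from one GKE JSON log line.

    Keys other than textPayload and resource.labels.pod_name are ignored.
    """
    record = json.loads(raw)
    return record["resource"]["labels"]["pod_name"], record["textPayload"]
```

For the first example line above, this returns `node_a1` together with the raw `nsup-start` log line.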

K8s JSON default logs

Each node must be logged in a separate JSON file, because this format carries no node markers. Additional JSON fields may be present and are ignored. Use one JSON object per Aerospike log line.

Do NOT wrap the lines in a JSON list ([]).

{"log": "LOG_LINE"}

Example:

{"log": "May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1"}
{"log": "May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:814) {test} nsup-done: non-expirable 0 expired (4474593206,22157)"}
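If an export wraps the records in a JSON list, they need to be flattened to one object per line first. The sketch below does this in Python and has the same effect as the `jq -c '.[]'` step used in the GKE section of this document. The function name `array_to_lines` is an assumption for this example.

```python
import json

def array_to_lines(text):
    """Convert a JSON list of records into one compact JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON list")
    # Compact separators keep each record on a single line
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)
```

The returned text can be written out as the per-node JSON log file.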

K8s JSON alternative log format

Each node must be logged in a separate JSON file, because this format carries no node markers. Additional JSON fields may be present and are ignored. Use one JSON object per Aerospike log line.

Do NOT wrap the lines in a JSON list ([]).

{"jsonPayload":{"log": "LOG_LINE"}}

Example:

{"jsonPayload":{"log": "May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:402) {test} nsup-start: expire-threads 1"}}
{"jsonPayload":{"log": "May 19 2021 23:59:57 GMT: INFO (nsup): (nsup.c:814) {test} nsup-done: non-expirable 0 expired (4474593206,22157)"}}

AWS CloudWatch encapsulation

Note that AWS CloudWatch CSV encapsulation is not supported. If exporting logs from CloudWatch, or from CloudWatch to S3, the logs must first be sanitised by removing the AWS encapsulation.

Fluentd k8s logger - split format parser

In addition to the formats above, a split format is supported, in which each Aerospike log line is broken out into separate JSON fields.

Each node must be logged in a separate JSON file, because this format carries no node markers. Additional JSON fields may be present and are ignored. Use one JSON object per Aerospike log line.

Do NOT wrap the lines in a JSON list ([]).

The JSON format is as follows:

{
    "timestamp": "LOG_TIMESTAMP_FORMAT:2006-01-02T15:04:05Z",
    "jsonPayload":{
        "level": "LOG_LEVEL_FROM_AEROSPIKE_LOG_LINE",
        "module": "LOG_MODULE_FROM_AEROSPIKE_LOG_LINE",
        "module_detail": "LOG_MODULE_DETAIL_FROM_AEROSPIKE_LOG_LINE",
        "message": "REST_OF_LOG_MESSAGE_FROM_AEROSPIKE_LOG_LINE"
    }
}

Example:

{"timestamp": "2021-05-19T23:59:57Z","jsonPayload":{"level": "INFO","module": "nsup","module_detail": "nsup.c:402","message": "{test} nsup-start: expire-threads 1"}}
{"timestamp": "2021-05-19T23:59:57Z","jsonPayload":{"level": "INFO","module": "nsup","module_detail": "nsup.c:814","message": "nsup-done: non-expirable 0 expired (4474593206,22157)"}}
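To show how the split fields relate to a raw Aerospike log line, the sketch below reassembles one from a split-format record. It is an illustration only, under the assumption that the raw line has the shape seen in the earlier examples (`May 19 2021 23:59:57 GMT: LEVEL (module): (module_detail) message`) with a zero-padded day; the name `split_to_raw` is hypothetical and this is not the preprocessor's own code.

```python
import json
from datetime import datetime

# Fixed month names avoid depending on the process locale
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def split_to_raw(raw):
    """Rebuild a raw-style Aerospike log line from one split-format JSON record."""
    record = json.loads(raw)
    ts = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    p = record["jsonPayload"]
    stamp = f"{MONTHS[ts.month - 1]} {ts.day:02d} {ts.year} {ts:%H:%M:%S} GMT"
    return f"{stamp}: {p['level']} ({p['module']}): ({p['module_detail']}) {p['message']}"
```

Applied to the first example record above, this yields the same `nsup-start` line used in the raw and tab-separated examples.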

Getting logs in the right format from GKE

Option 1 - flattened json

gcloud logging read ... --order=asc --format="json(timestamp,resource.labels.pod_name,textPayload)" |jq -c '.[]'

Option 2 - tab-separated for a single cluster

gcloud logging read ... --order=asc --format="csv[separator='\t'](timestamp,resource.labels.pod_name,textPayload)"

Option 3 - tab-separated for multiple clusters

gcloud logging read ... --order=asc --format="csv[separator='\t'](timestamp,resource.labels.pod_name,resource.labels.cluster_name,textPayload)"