A Logging Standard

Format

Use JSON

Human readable(ish)
Machine readable
Abundant tooling
- jq
- Document stores like Elastic Seach
- almost every programming langauge

Common Data Structures

Interoperability enhanced by adopting consistent data structures (detailed below).

Include Provenance IDs.

Provenance IDs

What is a Provenance ID?

Provide a Universally Unique ID (but not a UUID)
Provide a calling context
- Clear relationship between a parent ID and its children.

ID Generation

If a provenance ID is provided with a request, use it as the parent of all work.

If no ID is found, generate a new ID from your service's Base ID and use it.

Adoption patterns

As services add support, we will be able to make connections between systems, but no system will fail to track traffic while waiting for upstream services to implement tracking.

Web Service Implementation

HTTP/HTTPS services need to check for the X-Bean-ProvenanceId header.

Structure

Base ID
- Any alpha-numeric string
- Something like "SuperProductTypeCGI"
- One scheme that works well is "Product-Service" like "ABC-Superprod"
Unique Element
- Any string that provides uniqueness
- UUID is convenient, base64 encoding can keep it shorter.
- But any string that is unique across multiple systems running with the same base ID.
Request Path
- Each child request adds a segment with a monotonically increasing value.
- So the first child gets any natural number. Typically 1
- Any subsequent children must get a larger number. Typically the previous value, incremented by one.
- Local and remote children can have separate or shared counters.
- New items are appended as a comma delimited list.

ABNF Grammar

provenance-id = base-id "." unique-element request-path

base-id = ALPHA *safe-character

unique-element = *safe-character

request-path = ":" request-list
request-path =/ ""

request-path = local-request request-tail
request-path =/ remote-request request-tail

request-tail = "," 
request-tail =/ ""

local-request = number

remote-work = number "r"

number = non-zero-digit *DIGIT

non-zero-digit = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"

safe-character = ALPHA / DIGIT / safe-symbol

; Any printing, non-alphanumeric character isn't ,: or whitespace
safe-symbol =  "!" / DQUOTE / "#" /"$" / "%" / "^" / "&" / "'" / "(" ")" / "*" / "+"
safe-symbol =/ "-" / "." 
safe-symbol =/ "/"
safe-symbol =/ "<" / "=" / ">" / "?" / "@"
safe-symbol =/ "[" / "\" / "^" / "_" / "`"
safe-symbol =/ "{" / "|" / "}" / "~"

Example

Base ID: ABC-Superprod

Unique Element: 8ed56f13-bd6c-4f47-a8eb-94604e07e6ac

New request from customer: ABC-Superprod.8ed56f13-bd6c-4f47-a8eb-94604e07e6ac
Internal call to client config: ABC-Superprod.8ed56f13-bd6c-4f47-a8eb-94604e07e6ac:3
Client config looks at User config: ABC-Superprod.8ed56f13-bd6c-4f47-a8eb-94604e07e6ac.3.12r

Log Entry Objects

Log Item Types

Log Entry

[ Log Header, Log Body ]

Log Header

[   ISO8601-Time with milliseconds in UTC,
    Log Type Name, 
    hostname,
    program name,
    process ID, 
    thread ID,
    Provenance ID,
    Context name
]

Log Body

{   Log Type Name => Log Type Data,
    file => name of file where log entry generated,
    line => line number of log entry,
    method => name of method/function that generated the entry,
}

Event

Type Name: EVENT
{    message => A string describing the event
}

Unit Of Work

Type Name: UOW
{   start    => ISO6801 date time stamp with milliseconds
    end      => ISO6801 date time stamp with milliseconds
    duration => duration in milliseconds, integer value
    result   => result type identifier
    metrics  => dictionary, keys are strings values are metric objects
    values   => dictionary, keys are strings, values are strings
}

Metric

{    units => string
     count => integer, the number of times the metric was incremented
     total => sum of all values assigned to the metric
}

Result Types

NORMAL - successful completion of work

INVALID - Unit of Work terminated improperly

FAILURE - Unit of Work could not be completed, for example, the requested file does not exist.

EXCEPTION - Unit of Work generated an exception and could not be completed.

What Next?

Early to Mid-November

Review Object definitions
Prepare JSON Schema for object definitions
Prepare enhanced demo.
Meet with Tech Teams (US and UK) and cover proposal.

The Future

Collate feedback from Tech Teams.
Finalize object and document all data structures
Adapt Log::Work to work with existing BB logging facilities
Work with Operations to ensure that their tools can digest generated logs
Work with other teams to get adoption of the standard
Create Provenance ID tooling in Node.
- Should client side code be able to generate Provenance IDs?
  - Lean strongly towards "no" - risk of duplicate IDs being injected maliciously.
  - Instead use session IDs as a correlation ID for UI interaction.
  - Extending tracking would be nice, but need a way to do it safely.
Set up proxies/firewalls to strip provenance ID headers from outgoing requests/responses.
- May need exceptions to handle cases where data round trips through customer servers/services.

Development notes

Jump to

Quick reference

How to do logging and monitoring

A Logging Standard

Format

Use JSON

Common Data Structures

Provenance IDs

What is a Provenance ID?

ID Generation

Adoption patterns

Web Service Implementation

Structure

ABNF Grammar

Example

Log Entry Objects

Log Item Types

Log Entry

Log Header

Log Body

Event

Unit Of Work

Metric

Result Types

What Next?

Early to Mid-November

The Future

See Also

0. Requests & responses

1. Unexpected exception

2. Expected fatal exception

3. Expected non-fatal, actionable error

4. Expected non-fatal non-actionable error

5. Expected WARN or INFO level events

Systems

Search this blog

Highlights

Quick reference

Links

Blog Archive

Subscribe To

Tags