A piggy bank of commands, fixes, succinct reviews, some mini articles and technical opinions from a (mostly) Perl developer.

Quick reference

Ubuntu Linux setup basics

For username "foo":

$ adduser foo

$ passwd foo

$ sudo usermod -aG sudo foo

$ mkdir -p /home/foo/.ssh

$ cat the_public_key.pem >> /home/foo/.ssh/authorized_keys

$ chown -R foo:foo /home/foo/.ssh

$ chmod 700 /home/foo/.ssh

$ chmod 600 /home/foo/.ssh/authorized_keys

Disable default account:

$ usermod -s /usr/sbin/nologin default_username

Notes:
  • Use adduser, not useradd: on Debian/Ubuntu, adduser is the friendlier wrapper that creates the home directory for you.
  • Even when logging in with an SSH key only, the user must still have a password; it will only be used for sudo commands.

Sudoers basics

Define text editor

$ sudo update-alternatives --config editor

Edit the /etc/sudoers file (visudo edits it by default, and checks the syntax before saving)

$ sudo visudo

Add a user to sudo group

$ sudo usermod -aG sudo username
or
$ sudo gpasswd -a username sudo

(but CentOS uses "wheel" group instead)
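
Either way, confirm membership with id (the change takes effect at the next login):

$ id username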

Default config for Ubuntu 22:

# User privilege specification
root    ALL=(ALL:ALL) ALL

# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL

# Allow members of group sudo to execute any command
%sudo   ALL=(ALL:ALL) ALL

Remove login for default users:

$ usermod -s /usr/sbin/nologin username
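
A common tweak, applied with visudo: allow the sudo group to run commands without a password prompt (useful for key-only accounts):

%sudo   ALL=(ALL:ALL) NOPASSWD: ALL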

Elasticsearch advanced queries

See also ElasticSearch basics.

DQL to filter by non-zero length: Advert.location_query:* (does not work as a filter)
Or in Lucene: Advert.location_query:?*

Results are limited to 10,000 records unless you use the scroll API, which can paginate and also make parallel requests (see the sketch after the gist below).

Gist:
use strict;
use warnings;
use Data::Dumper::Concise;
use Search::Elasticsearch;

my $ekk = Search::Elasticsearch->new(
    client           => '7_0::Direct',
    nodes            => [ 'https://opensearch.example.com/', ],
    send_get_body_as => 'POST',
    $ENV{USE_EKK_PROXY} ? ( handle_args => { https_proxy => $ENV{USE_EKK_PROXY} } ) : (),
);

my $results = $ekk->search(
    body => {
        "size" => 500,
        "sort" => [
            {
                "timestamp" => {
                    "order"         => "desc",
                    "unmapped_type" => "boolean"
                }
            }
        ],
        "aggs" => {
            "2" => {
                "date_histogram" => {
                    "field"             => "timestamp",
                    "calendar_interval" => "1d",
                    "time_zone"         => "Europe/London",
                    "min_doc_count"     => 1
                }
            }
        },
        "stored_fields"   => ["*"],
        "script_fields"   => {},
        "docvalue_fields" => [
            {
                "field"  => "timestamp",
                "format" => "date_time"
            }
        ],
        "_source" => {
            "excludes" => []
        },
        "query" => {
            "bool" => {
                "must"   => [],
                "filter" => [
                    {
                        "match_all" => {}
                    },
                    {
                        "match_phrase" => {
                            "foo_field" => "bar_value"
                        }
                    },
                    {
                        "range" => {
                            "timestamp" => {
                                "gte"    => "2023-06-28T12:04:15.943Z",
                                "lte"    => "2023-09-28T12:04:15.943Z",
                                "format" => "strict_date_optional_time"
                            }
                        }
                    }
                ],
                "should"   => [],
                "must_not" => []
            }
        },
        "highlight" => {
            "pre_tags"  => ["\@opensearch-dashboards-highlighted-field\@"],
            "post_tags" => ["\@/opensearch-dashboards-highlighted-field\@"],
            "fields"    => {
                "*" => {}
            },
            "fragment_size" => 2147483647
        }
    }
);

print Dumper($results);
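
The scroll API mentioned above, sketched with the client's scroll_helper method (the index name, page size and query here are placeholder assumptions):

my $scroll = $ekk->scroll_helper(
    index => 'my_index',        # placeholder index name
    body  => {
        size  => 1000,          # documents per scroll page
        query => { match_all => {} },
    },
);
while ( my $doc = $scroll->next ) {
    print $doc->{_source}{timestamp}, "\n";
}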

MySQL date display format conversion

MySQL date time conversion functions: 

  • UNIX_TIMESTAMP(date) docs
  • FROM_UNIXTIME(epoch) docs

Examples:

select name, from_unixtime(time_added, '%Y-%m-%d %H:%i')
from company
order by time_added desc
limit 10; -- list the most recently added companies

You may also omit the '%Y-%m-%d %H:%i' format string to get the default format YYYY-MM-DD HH:MM:SS, e.g. `2023-09-07 09:43:51`
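
Going the other way with UNIX_TIMESTAMP (note that the session time zone applies to the conversion):

select unix_timestamp();                      -- current epoch seconds
select unix_timestamp('2023-09-07 09:43:51'); -- epoch seconds for a given datetime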


Test2 cheat sheet for Perl

Cheat sheet

Links

Docs entry point

  • Test2::Tools::Compare
    • is like isnt unlike
    • match mismatch validator
    • hash array bag object meta number float rounded within string subset bool
    • in_set not_in_set check_set
    • item field call call_list call_hash prop check all_items all_keys all_vals all_values
    • etc end filter_items
    • T F D DF E DNE FDNE U L
    • event fail_events
    • exact_ref

Summary

use Test2::V0;

# Match regex in a hash
like( $some_hash, hash {                        # <-- Must use `like` keyword to make regex below work
    field 'message' => qr/Caught exception/;    # <-- Note `field` keyword and trailing semicolon. Quotes around key optional
    end();                                      # <-- Enforce that no other keys are present, optional
}, 'Logged error correctly');                   # <-- Test description and parentheses () are optional

Comparisons

is(
    {
        a => 1,
        b => 'foo',
    },
    {
        a => D(),   # value is Defined
        b => E(),   # value Exists
        c => DNE(), # key/value Does Not Exist
    },
    'I can haz Test2?'
);
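
Array checks follow the same pattern; a minimal sketch:

is(
    [ 1, 2, 'extra' ],
    array {
        item 1;   # element 0 must be 1
        item 2;   # element 1 must be 2
        etc();    # allow additional items (use end() to forbid them)
    },
    'array comparison'
);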

Todo - legacy

use Test::More;

TODO: {
    local $TODO = 'Still working on this';
    ok(0, "work in progress");
}

Todo2

use Test2::Tools::Tiny qw/todo/;

todo 'Still working on this' => sub {
    ok(0, "failing test gonna fail");
}; # <--- remember the semicolon!

See t/regression/todo_and_facets.t


Chrome extensions: Download manager reviews


  • DownThemAll: Queues, but doesn't intercept
  • Thunder Download Manager: Intercepts, but doesn't queue
  • Free Download Manager: Just says "Loading..." (on Ubuntu)
  • Chrono Download Manager: Intercepts and queues! And resumes. Perfect!

curl -o / wget -O

curl --output file
curl -o file

wget --output-document file
wget -O file

Best to just always use curl, and remember that its common options are lowercase, as you'd expect.

It's usually already installed as well.
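
One notable exception to the lowercase rule is resuming a partial download:

curl -C - -o file URL
wget -c -O file URL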

Analyzing HAR files with jq

See gist:

jq docs

jq playground

Normal mode

Exploring HAR files

export HAR_FILE="/path/to/har/file"

  • Example to dump responses, for a given request URI

    • REQUEST_URI="https://www.facebook.com/api/graphql/"; cat $HAR_FILE | jq -r ".log.entries[] | if .request.url | test(\"$REQUEST_URI\") then .response.content else empty end"
      • Note the string passed to jq is in double quotes " so that the $REQUEST_URI is interpolated
      • But jq wants us to use double quotes for test("foo"), therefore they must be escaped like test(\"foo\")
  • Another way to do the same thing in bash using single quotes. Quotes can be tricky.

    • REQUEST_URI="https://www.facebook.com/api/graphql/"; cat $HAR_FILE | jq -r '.log.entries[] | if .request.url | test("'$REQUEST_URI'") then { uri: .request.url, mimeType: .response.content.mimeType, content: .response.content.text | .[0:200] } else empty end'
    • Note the string passed to jq is in three parts:
      • '...etc...test("'
      • $REQUEST_URI
      • '") then...etc...else empty end'
    • The content is truncated to the first 200 characters, to make it more readable
  • Dump the full response content, interpreted as JSON (same idea as above, piping the text through the fromjson filter)

    • REQUEST_URI="https://www.facebook.com/api/graphql/"; cat $HAR_FILE | jq -r '.log.entries[] | if .request.url | test("'$REQUEST_URI'") then .response.content.text | fromjson else empty end'

Streaming mode

...todo

Case studies

YouTube

Goal: Extract URLs of all your playlists

(under development)

  • Go to https://music.youtube.com/library/playlists in browser, scroll slowly down to the bottom
    • Chrome | DevTools | Network tab | Save all as HAR
  • Extract response text for relevant requests
    • cat $HAR_FILE | jq -r '.log.entries[] | select( .request.url | test("^https://music.youtube.com/youtubei") ) | .response.content.text' > $REQS_FILE
  • Approach 1: Loop over lines of file and extract playlistIDs (status: draft -- this gets playlist titles)
    • cat $REQS_FILE | while read line; do echo "$line" | jq '.contents.singleColumnBrowseResultsRenderer.tabs[].tabRenderer.content.sectionListRenderer.contents[].musicCarouselShelfRenderer.contents[].musicTwoRowItemRenderer.title.runs[] | { name: .text, id: .navigationEndpoint.browseEndpoint.browseId }'; done > $PLAYLISTS_FILE
    • Bugs:
      • duplicate values
      • jq errors
      • last 5 entries are irrelevant
      • missing most entries!
  • Approach 2: Scan for all relevant playlist IDs, wherever they are in the document
    • cat playlists.2 | jq -r 'getpath( paths | select(.[-1] == "browseId") ) | select(. | match("^VLPL"))'
    • Bugs:
      • jq error: parse error: Invalid numeric literal at line 11, column 0
      • missing some entries
  • Approach 3: Give up and use Perl regex
    • cat $REQS_FILE | perl -lne'@ids = m/"browseId":"([^"]+)"/g; print $_ foreach map { s/^VL//; $_ } grep { /^VLPL/ && length($_) > 22 } @ids' | uniq > $PLAYLISTS_FILE
    • Bugs:
      • This was supposed to be a jq cheat sheet, using Perl is cheating!
      • It still misses some playlists from the initial page load.
  • Approach 4: Found another source of data in the page
    • cat $HAR_FILE | jq -r '.log.entries[] | select( .request.url | test("^https://music.youtube.com/library/playlists") ) | .response.content.text' > $SCRIPT_DATA
    • Decode it
      • cat $SCRIPT_DATA | perl -plne's/(\\x[[:xdigit:]]{2})/qq{"$1"}/eeg' > $DECODED_SCRIPT_DATA
    • Maybe little bit of manual munging :/
    • ...TODO... extract the browseIDs

AlternativeTo

Goal: Extract list of alternative software

Fetch JSON

Extract data

  • export REGEX="software/gmail.json"; cat alternativeto.net.har | jq -r ".log.entries[] | if .request.url | test(\"$REGEX\") then .response.content.text else empty end" > page_per_line
    • this results in 9 lines, one for each 'page' you loaded
  • change the [] above to [0] to get one page, and pipe the result through jq again or use the fromjson filter as follows:
    • export REGEX="software/gmail.json"; cat alternativeto.net.har | jq -r ".log.entries[0] | if .request.url | test(\"$REGEX\") then .response.content.text | fromjson else empty end" > one_page_one_line
  • Now browse this JSON data, preferably in an IDE like VS Code that can fold sections easily, to discover the following structure:
    • export REGEX="software/gmail.json"; cat alternativeto.net.har | jq -r ".log.entries[] | if .request.url | test(\"$REGEX\") then .response.content.text | fromjson | .pageProps.items[] | { name: .name, cost: .licenseCost, model: .licenseModel, desc: .shortDescriptionOrTagLine } else empty end" > software.json

Sample output

{
  "name": "Mailfence",
  "cost": "Freemium",
  "model": "Proprietary",
  "desc": "Mailfence is a secure and private email service that fights for online privacy and digital freedom."
}
{
  "name": "Proton Mail",
  "cost": "Freemium",
  "model": "Open Source",
  "desc": "Secure email with absolutely no compromises, brought to you by MIT and CERN scientists."
}
...etc

Tips, tricks and gotchas

Decode HTML entities

e.g. converts AT&amp;T Webmail to AT&T Webmail

npm install -g he
cat software.json | jq '.name' -r | he --decode
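
A Perl alternative, if you'd rather not install an npm package (assumes the HTML::Entities module from CPAN):

cat software.json | jq -r '.name' | perl -MHTML::Entities -nle 'print decode_entities($_)'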

Debugging

For very simple test examples, you must quote inputs twice, i.e. pass "foo" with the quotes included, so that jq receives a valid JSON string

echo '"hello"' | jq '.'

Regex. gsub = global substitution. Note the semicolon ; to separate arguments to gsub().

echo '"foo\r\nbar"' | jq -r 'gsub("(\r\n.+)"; "")'

Video processing for fun

Mac

  • QuickTime Player
    • Edit menu | Add clip after...
    • 40 videos, total 250MB = (Not Responding) & Pinwheel of doom
    • 34 videos, total 140MB = several minutes of editing, then (Not Responding) & Pinwheel of doom
  • iMovie
    • To download for an older version of MacOS, open App Store | Purchased | Install

Linux

Windows

Web

Android

iPhone


Command-line data processing 2023

Some developer tools with a CLI for processing XML, XHTML, HTML, JSON, YAML, etc.

XML

  • xsh (perl - Choroba)
    • cpm install XML::XSH2
    • xsh -P file.xml
    • ls
    • help ls
    • help | less
    • <TAB> autocompletion
  • xmllint
    • xmllint --xpath "//foo" file.xml
    • xmllint --shell file.xml
  • xmlstarlet
    • xmlstarlet sel -t -v "//foo" file.xml
  • xq (golang - )
    • apt-get install xq
  • xq (python - jeffbr13)
    • pip install xq

JSON

  • jq
    • cat file.json | jq . # format
    • cat file.json | jq '.[]' # unpack array elements

Search

  • fzf
  • ripgrep
  • ag
  • ack
    • Doesn't search "binary" files by default
  • vim - for searching files

Windows 10 robocopy basics

robocopy c:\temp\source c:\temp\destination /E /DCOPY:DAT /R:10 /W:3
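
What those switches do:

  • /E - copy subdirectories, including empty ones
  • /DCOPY:DAT - copy directory Data, Attributes and Timestamps
  • /R:10 - retry each failed copy 10 times (the default is 1 million)
  • /W:3 - wait 3 seconds between retries (the default is 30)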

Running Emby on Linux

wget https://github.com/MediaBrowser/Emby.Releases/releases/download/4.5.4.0/emby-server-deb_4.5.4.0_amd64.deb

sudo dpkg -i emby-server-deb_4.5.4.0_amd64.deb

sudo systemctl status emby-server.service
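
If the service isn't running yet, the usual systemd commands apply:

sudo systemctl enable --now emby-server.service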