Collecting log files with `scidbctl.py collect-diags`
The `scidbctl.py` script can collect SciDB log files and other diagnostic information into a tar(1) archive that can be sent to Paradigm4 for analysis. Use the `-h` option to see the syntax for this subcommand:
```
$ scidbctl.py collect-diags -h
usage: scidbctl.py collect-diags [-h] [-l] [cluster]

positional arguments:
  cluster      SciDB cluster name. Must name a section in the config.ini file
               (see -c/--config option). If not specified, use SCIDB_NAME
               environment variable if set, else use the first cluster in
               config.ini.

optional arguments:
  -h, --help   show this help message and exit
  -l, --light  Skip large objects such as binaries and core files
$
```
If you are running SciDB as a service, you must also specify the location of the SciDB configuration file, for example `scidbctl.py --config /opt/scidb/23.10/service/config-0-mydb collect-diags`.
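When combining options, `--config` is a global option that precedes the subcommand (as in the example above), while `--light` and the optional cluster name follow it. An illustrative service-style invocation, reusing the configuration path and the `mydb` cluster name that appear elsewhere in this section, might look like:

```
$ scidbctl.py --config /opt/scidb/23.10/service/config-0-mydb collect-diags --light mydb
```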
The `collect-diags` subcommand collects the following information:

- All `scidb.log*` files from all instances. At present there is no ability to select a time range.
- Contents of the `etc` and `share` subdirectories under `/opt/scidb/23.10`.
- A “system report” that includes output from the following commands (a standalone sketch for reproducing a similar report by hand appears after this list):
"sysctl -a", # Kernel parameters "ip a", # Interfaces and addresses "netstat -i", # NIC statistics "netstat -r -n", # Routes "arp -an", # ARP cache "vmstat -s", # Memory statistics "vmstat -a", # Active/inactive memory "sudo vmstat -m", # Slab stats "vmstat -d -w", # Disk usage "dmesg", # Kernel ring buffer
- MD5 checksums of all files in the base installation directory, for example `/opt/scidb/23.10`.
- Stack traces of all currently running SciDB instances (if gdb is installed and gstack(1) is available).
- Stack traces of any `core*` files found in the instance data directories. Unless the `-l/--light` option is specified, the `core*` files themselves will also be collected.
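As noted above, the following is a minimal standalone sketch, not part of scidbctl.py, of how a similar system report could be captured by hand on a single host; the report file name is arbitrary:

```
#!/usr/bin/env bash
# Standalone sketch (not part of scidbctl.py): run each system-report command
# and append its output, with a header, to a single report file.
report=system_report.txt
: > "$report"
for cmd in "sysctl -a" "ip a" "netstat -i" "netstat -r -n" "arp -an" \
           "vmstat -s" "vmstat -a" "sudo vmstat -m" "vmstat -d -w" "dmesg"; do
    printf '===== %s =====\n' "$cmd" >> "$report"
    $cmd >> "$report" 2>&1   # unquoted on purpose so the command word-splits
    printf '\n' >> "$report"
done
```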
Current best practice is to configure creation of SciDB core dumps in the `/var/crash/scidb` directory; note that `scidbctl.py collect-diags` will not find these core files.
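How to route core dumps to that directory is outside `collect-diags` itself. One common approach, sketched here with an illustrative file name and pattern (this is an assumption, not a documented SciDB procedure), uses the kernel's core_pattern setting:

```
# Illustrative sketch only -- adjust paths, pattern, and ownership as needed.
sudo mkdir -p /var/crash/scidb
sudo chown scidb: /var/crash/scidb      # owner must be the account SciDB runs as
echo 'kernel.core_pattern = /var/crash/scidb/core.%e.%p.%t' |
    sudo tee /etc/sysctl.d/90-scidb-core.conf
sudo sysctl --system                    # apply the new setting without rebooting
```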
A sample `collect-diags` run on a very small cluster looks like this:
```
$ scidbctl.py collect-diags
[scidbctl] Collecting diagnostics at 2024-03-22T220505
[scidbctl-0-0-mydb] Producing diagnostics...
[scidbctl-0-1-mydb] Producing diagnostics...
[scidbctl-0-1-mydb] Diagnostics generated in <datadir>/diags/2024-03-22T220505
[scidbctl-0-0-mydb] Tracing stack for running SciDB pid 215092 ...
[scidbctl-0-0-mydb] Tracing stack for running SciDB pid 215093 ...
[scidbctl-0-0-mydb] Diagnostics generated in <datadir>/diags/2024-03-22T220505
[scidbctl] Gathering collected diagnostics
[scidbctl] Diagnostics in /data/scidb/0/0/diags/all-2024-03-22T220505.tar
$
```
Each individual instance’s diagnostics are placed into a compressed tar archive, and those are placed in turn into an uncompressed all-cluster archive (since running compression twice can actually expand data):
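To unpack such a bundle for inspection, something along these lines should work, assuming the per-instance archives inside the outer tar are gzip-compressed and named `*.tar.gz` (an assumption; adjust the glob to whatever names are actually produced):

```
# Archive path taken from the sample run above.
mkdir -p diags-unpacked
tar -xf /data/scidb/0/0/diags/all-2024-03-22T220505.tar -C diags-unpacked
find diags-unpacked -name '*.tar.gz' -execdir tar -xzf {} \;
```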