Friday, November 20, 2015

Useful Linux commands

It's amazing what you can do on the command line in a Linux environment - a combination of find, grep, sed, awk and others goes a long way in troubleshooting. There's not a day that goes by where I don't have to do something "interesting", and it seems there's nothing you can't do with the right combination of commands. Here are some of the ones I use every so often.
NOTE: in my case these all work on Oracle Enterprise Linux v7. Your *nix version (or shell) may require slightly different syntax.

Move or Copy files in place
This will expand to: mv myfilename myfilename.old
mv myfilename{,.old}
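A quick sketch of the expansion in action (bash brace expansion; the file name is just a placeholder):

```shell
cd "$(mktemp -d)"
touch report.txt
# the shell expands this to: mv report.txt report.txt.old
mv report.txt{,.old}
ls   # -> report.txt.old
```

The same trick works for copies, e.g. cp myconfig{,.bak}.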
Find files larger than 500 MB
You may have to adjust the field numbers. This will print out the size (field #5) and filename (field #9).
find . -type f -size +500000k -exec ls -lh {} \; | awk '{ print $5 ": " $9 }' 
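If you have GNU find, you can skip the ls/awk field counting with -printf (a sketch assuming GNU findutils; %s prints the size in bytes and %p the path):

```shell
# demo: create a large sparse file, then list files over 500 MiB with size and path
cd "$(mktemp -d)"
truncate -s 600M big.bin
truncate -s 1M small.bin
find . -type f -size +500M -printf '%s bytes  %p\n'   # -> 629145600 bytes  ./big.bin
```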
Extract lines from the middle of a file
tail -n +startinglinenum filename | head -n numlines
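For instance, to get lines 10 through 20 of a file (sed -n '10,20p' is an equivalent one-command alternative; the sample file here is generated just for illustration):

```shell
f=$(mktemp)
seq 1 100 > "$f"                 # sample file with 100 numbered lines
tail -n +10 "$f" | head -n 11    # lines 10..20 (11 lines starting at line 10)
sed -n '10,20p' "$f"             # same result
```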
Add a column of numbers
Here, we are summing the total size of deleted files whose handles are still active. In my case, field #8 was the file size.
/usr/sbin/lsof | grep "deleted" | awk '{sum += $8} END {print sum/1024/1024,"Mb"}'
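The awk idiom generalizes to summing any numeric column from any command's output. A minimal self-contained sketch with sample numbers:

```shell
# sum the first column of the input
printf '100\n250\n650\n' | awk '{sum += $1} END {print sum}'   # -> 1000
```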
Add up the total size of all files of a particular type, when you can't rely on the file extension
For example, all PDF files. Our Alfresco doc repository has all binaries named with ".bin", so we cannot simply filter by filename ending.
find . -type f -exec file -in {} \; | grep pdf | cut -d ':' -f 1 | xargs ls -l | awk '{ sum += $5 } END { print sum/1024/1024,"Mb" }'
Count docs by mime type
find ./foldername -type f -exec file -inb {} \; | sort | uniq -c | sort -nr
You'll get output something like this:
     20 application/pdf
      6 application/x-zip
      3 text/plain; charset=us-ascii
      3 image/png
      2 application/octet-stream
      1 application/msword

Count docs by file type - same as previous, more human readable
find . -type f | xargs file | cut -d ':' -f 2 | sed 's/^ *//' | sort | uniq -c | sort -nr
Now the output is like this:
     13 PDF document, version 1.5
      6 Zip archive data, at least v2.0 to extract
      5 PDF document, version 1.6
      3 PNG image data, 77 x 100, 16-bit/color RGBA, non-interlaced
      2 PDF document, version 1.7
      2 data
      1 Microsoft Office Document
      1 ASCII text, with no line terminators
      1 ASCII English text, with very long lines
      1 ASCII English text, with no line terminators

Finding a class file inside jar files, optionally ignoring one or more folders
for x in $(find . -name "*.jar"); do echo $x; jar tvf $x | grep myclassname; done
for x in $(find . -path ./blah -prune -o -name "*.jar" -print); do echo $x; jar tvf $x | grep myclassname; done
for x in $(find . \( -path ./blah -o -path ./woof \) -prune -o -name "*.jar" -print); do echo $x; jar tvf $x | grep myclassname; done
Finding the files open for writing by a process
e.g. what log files does a process write to? The 0-9 is the file descriptor; the u or w is the access mode (u means read/write, w means write-only); REG means a regular file. Look for descriptors 1 and 2: they are stdout and stderr.
Note: you may have to be a privileged user to see some of this information.
lsof -p PID | egrep ' [0-9]+[uw] ' | grep 'REG'
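On Linux you can also inspect a process's descriptors directly under /proc, no lsof needed; a quick sketch using the current shell's PID:

```shell
# fds 0, 1, 2 are stdin/stdout/stderr; each entry is a symlink to the open file
ls -l /proc/$$/fd
```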
Grepping for something in a file, but extracting the previous line for each
I like to show only the interesting columns. In this case, the columns for this particular type of log held the date, user, command, etc.; otherwise a lot of noise gets in the way.
grep -B 1 -E 'NoClassDefFoundError' diagnostic.log | cut -d ' ' -f 1,2,5,13,14,20
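An awk sketch of the same "previous line" trick, handy when grep -B isn't available or you want more control (ERROR and the sample input are placeholders):

```shell
# print the line before each line matching ERROR
printf 'ctx1\nERROR one\nother\nctx2\nERROR two\n' |
awk '/ERROR/ {print prev} {prev = $0}'   # -> ctx1, then ctx2
```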