another friendly bash one-liner example
05-28-2017, 11:23 AM
so here's a fun task for this morning.

i want to find text files in many folders. i also want to find trends in their names and paths that i can grep for, so i can look for large groups of semi-related things. there's a science to this and i'm not getting into it; this is about efficiency and making the task manageable.

so the first thing to do is find .txt files. it can be done using find's own parameters (-name '*.txt'), and i never bother because i'm already piping to grep. yeah, yeah, yeah:

find | grep "\.txt"
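
for comparison, here is a rough sketch of the same filter done with find's own tests instead of grep. -type f and -name are standard find tests; the throwaway directory and the file names are made up for the demo. one difference worth knowing: grep "\.txt" matches .txt anywhere in the path (so c.txt.bak gets through), while -name '*.txt' only matches names that end in .txt.

```shell
# build a throwaway tree so the example is safe to run anywhere
demo=$(mktemp -d)
mkdir -p "$demo/logs" "$demo/notes"
touch "$demo/logs/a.txt" "$demo/notes/b.txt" "$demo/notes/c.txt.bak" "$demo/notes/d.jpg"

# the approach from the post: list everything, then grep for ".txt" anywhere in the path
grep_hits=$(cd "$demo" && find . | grep "\.txt" | sort)

# find's own filter: regular files whose name ends in .txt
name_hits=$(cd "$demo" && find . -type f -name '*.txt' | sort)

echo "$grep_hits"
echo "---"
echo "$name_hits"

# clean up the demo tree
rm -rf "$demo"
```

for a quick look at a folder the grep version is usually fine; the -name version is the stricter of the two.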


in this listing we want to replace dots, slashes, dashes and underscores with spaces. we could use sed "s/[.\/_-]/\ /g" but since this is meant to be a friendly intro, let's use tr to replace one character at a time:

find | grep "\.txt" | tr '.' ' ' | tr '/' ' ' | tr '_' ' ' | tr '-' ' '
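
once the one-at-a-time version makes sense, the four tr calls can be folded into one, because tr takes a whole set of characters. a small sketch on a made-up path; the dash goes last in the set so tr reads it as a plain dash and not as a range. the sed line uses | as the delimiter (my choice, not the post's) so the slash needs no escaping.

```shell
# a sample path like the ones find prints (hypothetical)
path="./old-stuff/my_notes/to-do.txt"

# four separate tr calls, as in the one-liner
four=$(echo "$path" | tr '.' ' ' | tr '/' ' ' | tr '_' ' ' | tr '-' ' ')

# one tr call: each of . / _ - maps to a space (four spaces in the second set)
one=$(echo "$path" | tr './_-' '    ')

# a sed equivalent of the same replacement
viased=$(echo "$path" | sed 's|[./_-]| |g')

echo "$four"
echo "$one"
echo "$viased"
```

all three lines print the same thing, leading spaces included.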


now we have mostly letters and numbers, exactly what we want, but the tokens from each path are still on the same line. let's change all spaces to newlines, then put the tokens in alphabetical order:

find | grep "\.txt" | tr '.' ' ' | tr '/' ' ' | tr '_' ' ' | tr '-' ' ' | tr ' ' '\n' | sort


then uniq -c to count the repeats (uniq only collapses duplicates that sit next to each other, which is why we sorted first):

find | grep "\.txt" | tr '.' ' ' | tr '/' ' ' | tr '_' ' ' | tr '-' ' ' | tr ' ' '\n' | sort | uniq -c
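
a tiny sketch of why the sort has to come before uniq -c. the token list here is made up; the point is only that uniq compares each line with the one right before it.

```shell
# a made-up, unsorted token list
tokens=$(printf 'txt\nnotes\ntxt\nnotes\ntxt\n')

# without sort: no two equal lines are adjacent, so every line gets a count of 1
unsorted=$(echo "$tokens" | uniq -c)

# with sort first: equal lines sit together and are counted as one group
sorted=$(echo "$tokens" | sort | uniq -c)

echo "$unsorted"
echo "---"
echo "$sorted"
```

the first listing has five lines, each counted once; the second has one line per distinct token with the real totals.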


and sort -n (that's numeric sort; a plain sort would put 10 and 100 between 1 and 2) to put the most common tokens at the bottom (sort -rn will put them at the top):

find | grep "\.txt" | tr '.' ' ' | tr '/' ' ' | tr '_' ' ' | tr '-' ' ' | tr ' ' '\n' | sort | uniq -c | sort -n # public domain
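
to try the whole thing without touching real files, here is a sketch that wraps the one-liner in a function and runs it on a throwaway tree. the function name token_counts and the demo folders are made up for illustration. one thing it makes visible: find prints every path with a leading ./ and both of those characters become spaces, so each path also contributes two empty tokens, and the blank line tends to end up as the most common "token" of all.

```shell
# token_counts DIR: run the one-liner from the post inside DIR (hypothetical helper name)
token_counts() {
    (cd "$1" && find | grep "\.txt" | tr '.' ' ' | tr '/' ' ' | tr '_' ' ' | tr '-' ' ' | tr ' ' '\n' | sort | uniq -c | sort -n)
}

# made-up demo tree
demo=$(mktemp -d)
mkdir -p "$demo/trip_2016" "$demo/trip_2017"
touch "$demo/trip_2016/beach-day.txt" "$demo/trip_2017/beach-night.txt" "$demo/trip_2017/city.txt"

counts=$(token_counts "$demo")
echo "$counts"

# clean up the demo tree
rm -rf "$demo"
```

if the blank entry is a nuisance, putting grep -v '^$' in front of the first sort drops the empty lines before they get counted.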


already it's done. one find command, one (superfluous) grep command, 5 tr commands, uniq and 2 sorts: 10 easy commands on one line, and we have found the most common tokens in a recursive file listing.

so now if we are looking for large groups of files by strings that are common among them, we know which tokens (whether as part of the filename, or as part of a folder that has the most of that type of file) are most common. incidentally, the second most common token this time was "txt" (of course), so we know that one isn't useful.
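
once a token looks promising, one way to pull up its group is to grep the same listing for it. this is only a sketch: the token "beach" and the demo tree are made up, and grep -F is used so the token is treated as a plain string and not as a pattern.

```shell
# made-up demo tree
demo=$(mktemp -d)
mkdir -p "$demo/trip_2016" "$demo/trip_2017"
touch "$demo/trip_2016/beach-day.txt" "$demo/trip_2017/beach-night.txt" "$demo/trip_2017/city.txt"

# hypothetical token picked from the counts
token="beach"

# list the .txt files whose path contains the token
group=$(cd "$demo" && find | grep "\.txt" | grep -F "$token" | sort)
echo "$group"

# clean up the demo tree
rm -rf "$demo"
```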