45 Comments
- 3monkeys, on 10/12/2007, -4/+14Cool on of the admins fixed my typo in the title.
- silverfire, on 10/12/2007, -0/+9You don't have to be on Linux to use them :P
http://unxutils.sourceforge.net/
- bribera, on 10/12/2007, -2/+11I grep around awkwardly a lot, but I still manage to get some head. And tail.
- lwdallas, on 10/12/2007, -1/+10Get crazy, baby:
http://www.cygwin.com/
- zoombusa, on 10/12/2007, -0/+8what about sed?
- oceanbeachsf, on 10/12/2007, -0/+7Great utils. I use em all the time. I think it should be noted these utilities existed for at least 20 years before Linux. The kids in Berkeley were doing this years ago!!
- merreborn, on 10/12/2007, -0/+7and xargs.
- flash200, on 10/12/2007, -1/+7sed can be pretty useful, for applying regular expressions to a stream of text data, or for applying a regular expression to a variable in a bash script.
Haven't used awk yet but it looks very interesting.
- aeio, on 10/12/2007, -1/+7don't forget sort, uniq and tac (cat, but backwards)
- Bishoco, on 10/12/2007, -0/+5Anybody who wants to become well versed in UNIX needs to learn these tools. They are powerful and often very fast on even large files. Sed, grep, head, and tail get really powerful when you learn how to chain the commands together through piping. Awk is a really powerful tool as well. These suckers can save you a lot of time if you need to pull data out of files or convert it to a new format.
If you want to learn this stuff, I really recommend O'Reilly's Unix in a Nutshell. http://www.oreilly.com/catalog/unixnut4/index.html
And have a seasoned UNIX guy show you the ropes.
- merreborn, on 10/12/2007, -1/+6For many tasks, it's generally a lot faster to just pipe a couple of commands together on the commandline than it would be to write a perl script.
- gtchen66, on 10/12/2007, -0/+4These are all basic utilities that are part of the preliminary questions I ask any job candidate who claims to work with unix. They are extremely powerful individually, but become more so when combined. One thing the article doesn't do justice to is how to combine multiple utilities via pipes. No GUI in the world has the flexibility to do the kind of ad-hoc queries I routinely execute via some combination of these textutils.
The data from the recent AOL fiasco, for example, precedes each line with the userid. Let's say I wanted to know which user had the most searches during the 3 months that the data encompassed. My command would be:
cut -f1 *.txt | sort | uniq -c | sort -n | tail
I could refine it and set a variable to the final value with (depends on your shell):
set most = `cat user-ct-test-collection-??.txt | grep -v ^Anon | cut -f1 | sort | uniq -c | sort -rn | head -1 | awk '{print $2}'`
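A minimal sketch of the same pipeline in POSIX sh syntax, run against a small made-up tab-separated file. The sample data and user IDs below are invented for illustration and are not taken from the AOL files:

```shell
# Hypothetical sample data: user ID in the first tab-separated field.
tmp=$(mktemp)
printf '101\tlinux grep\n202\tawk tutorial\n101\tsed one-liners\n101\tuniq -c\n202\txargs\n303\ttac\n' > "$tmp"

# Count lines per user, sort by count (largest first), keep the top
# line, and strip the count so only the user ID remains.
most=$(cut -f1 "$tmp" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "$most"

rm -f "$tmp"
```

With this sample, user 101 has three lines, so that is the ID left in `most`.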
I'd love to see any GUI try that.
- midlifecrisis, on 10/12/2007, -2/+6Grep is most useful when combined with find.
- Sheco, on 10/12/2007, -0/+4The article shows a way to search for a word when you don't care about the case of the first letter:
# grep "[T, t]he" memo
WTF!!! That means: match a T, a comma, a space, or a t, followed by 'he'. It does not mean: match a T or a t, then 'he'.
To match one of several characters in a given position, you should do this instead:
# grep "[Tt]he" memo
- diecastbeatdown, on 10/12/2007, -0/+4daily tools of the trade. if you need info now, especially on the one-liner front then you need to know these apps. i find awk to be typically more useful than grep.
one of the most useful things about grep is the ability to use regex and recursively search. let's say you have 10 sub directories and you want to search all the files in there as well:
grep -rie '^c' /some/dir
This will search for any line that begins with the letter 'c' (capital or not since I used -i) in every file in every subdirectory in /some/dir. the -e just marks what follows as the pattern; grep treats the caret '^' as the beginning of a line either way. for extended regex syntax you can use -E, or egrep if your system has that link/app set up.
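A small sketch for trying the recursive search safely, assuming a grep that supports `-r` (GNU and BSD grep both do). The directory tree and file contents are made up for the demonstration:

```shell
# Hypothetical directory tree, created only for this demonstration.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'cat\nDog\n' > "$dir/a.txt"
printf 'Cow\nemu\n' > "$dir/sub/b.txt"

# -r recurses into subdirectories, -i ignores case, '^c' anchors the
# match to the start of the line, and -h leaves out the file names.
# LC_ALL=C makes the sort order predictable (uppercase first).
hits=$(grep -rih '^c' "$dir" | LC_ALL=C sort)
echo "$hits"

rm -rf "$dir"
```

Only the lines starting with 'c' or 'C' come back, one from the top-level file and one from the subdirectory.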
i found the article to be far too basic, and it really skimps on awk which is definitely more powerful than grep. here are some nice texts that help me out with sed and awk from time to time when my memory fails:
http://www.linuxjournal.com/article/1224
http://www.student.northpark.edu/pemente/awk/awk1line.txt
- gruvsf, on 10/12/2007, -0/+4I'm not Linux, I'm BSD!......GAWD!
http://youtube.com/watch?v=gFAJDbV9Vfs
- nofxjunkee, on 10/12/2007, -1/+4zsh + extended globbing pretty much obsoletes find.
http://zsh.dotsrc.org/Doc/Release/zsh_13.html#SEC62
- diecastbeatdown, on 10/12/2007, -0/+3perhaps the person has never used linux before? many of the livecd distros now invite people to simply use it as an ordinary OS. They are not even introducing users to the commandline anymore but giving graphical utilities for practically everything. There are even graphical Regex frontends now which you can set up scripts with and use with checkboxes and such. For new *nix users these days it is a different world.
- CypherXero, on 10/12/2007, -1/+4Well, I know from using both perl and these programs that using *nix utilities like awk and sed is much easier than trying to put together a perl script with the correct modules to accomplish the same task.
- kracker, on 10/12/2007, -0/+2Harsha,
I just don't hear much respect for the software you speak on from the article you have published.
You go to great lengths in the beginning to avoid describing GNU, the GNU system,
the FSF who created and actively develop the excellent software you are showing other users how to use.
Why not give some credit to the GNU project for all the years of excellent work.
Linux is a kernel; you're talking about the GNU/Linux Operating System, specifically
speaking about the core software utilities which are the basis of any GNU system
which the Linux kernel just happens to sit at top at this moment of functionality and popularity.
There are many kernels for the many different types of Operating Systems which use the GNU System.
The GNU system is found as the basis for a rather large percentage of the installed
software of which many different distributions consist.
This article has little to do with Linux and even less with the old world,
feature-weak unix counterparts which were the basis for most of any unix-like operating system.
You are talking about GNU Free Software programs, not Linux (a kernel).
You're incorrectly describing these FSF/GNU Free Software programs as
Linux and Open Source, clearly using a proven inaccurate description
of the GNU System's core toolkit and history of each of these powerful
software utilities. This kind of glossing over facts is not
something I would have expected from IBM.
It would be better to be clear and factual with the details of who
makes this software. Your readers may wish to get more information
on the software you name/describe incorrectly.
If your readers were to know in the first paragraph that you were talking
about the tools found in the GNU system, GNU/Linux or even from the GNU Toolbox
(at the least), which is something IBM already publicly provides for ppc/AIX at:
http://www-03.ibm.com/servers/aix/products/aixos/linux/index.html
http://www-03.ibm.com/servers/aix/products/aixos/linux/download.html
It just seems like it would be more useful to the readers of the article
to know where to look and find out further specific details about the
software programs which the article is about.
For example, if you were trying to use these GNU commands on another
Unix-like operating system which did not have GNU installed,
like Solaris, AIX, or FreeBSD, these commands (if they were available) would
have a reduced feature set and a different set of program arguments.
There is a difference between a distribution, the maintainers, and developers of software.
The real title of your article should have been, "Simplify data extraction using GNU text utilities",
and if your management insists on using Linux for name recognition despite the content of the article,
use "Simplify data extraction using GNU/Linux text utilities" and add a few GNU references.
References:
- GNU Project
http://gnu.org/
- GNU text utilities : Toolbox introduction
http://www.gnu.org/software/textutils/manual/textutils/html_node/textutils_46.html
- GNU Grep Manual
http://www.gnu.org/software/grep/doc/
- GNU Core Utilities Manual
http://www.gnu.org/software/coreutils/manual/
- GNU Awk Manual
http://www.gnu.org/software/gawk/manual/
- GNU Manuals
http://www.gnu.org/manual/manual.html
- Wikipedia: GNU
http://en.wikipedia.org/wiki/GNU
- GNU Software on ibm.com
http://www-128.ibm.com/developerworks/views/aix/downloads.jsp
- AIX Toolbox
http://www-03.ibm.com/servers/aix/products/aixos/linux/index.html
http://www-03.ibm.com/servers/aix/products/aixos/linux/download.html
- Google Search : GNU System
http://www.google.com/search?q=GNU+toolset&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-US:official
Respectfully,
//kracker
- bobmontana, on 10/12/2007, -1/+3I've been picking up perl, and while I'm only in the beginning stages of learning the language, it seems as if it can do the majority, if not all, of what the utilities on this page can do.
Can anyone explain the advantages/disadvantages of perl vs awk/grep/sed/find etc?
- bobmontana, on 10/12/2007, -0/+2A quick google led me to this page. I know it does seem somewhat Perl-biased, but still is an interesting comparison.
http://pubcrawler.org/perl-rocks.html
- NTolerance, on 10/12/2007, -0/+2And while you're at it get the best damn Windows terminal program integrated with Cygwin:
http://gecko.gc.maricopa.edu/~medgar/puttycyg/
- jo42, on 10/12/2007, -1/+3This is just so lame...
Like doing an article for carpenters on hammers and nails.
- flash200, on 10/12/2007, -0/+2If you can reasonably accomplish a task at the command line using the above utilities, then that's probably the best way to go. If the task is complex or diverse enough to require more of a full program, then write a script in Perl, Python, Ruby, or PHP.
There's some middle ground, where it's helpful to create a bash script that uses the above utilities. That saves you from having to re-type the commands each time, and you can tackle problems that are a little more complex.
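As a rough sketch of that middle ground, here is a tiny script that wraps a pipeline in a reusable function. The name `top_values` and the sample data are hypothetical, made up for this example, and the function assumes the first field contains no spaces:

```shell
#!/bin/sh
# top_values FILE [N] (hypothetical helper): print the N most frequent
# values of the first tab-separated field in FILE as "count value" lines.
# N defaults to 10 when it is not given.
top_values() {
    cut -f1 "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}" |
        awk '{print $1, $2}'
}

# Example run against made-up data.
sample=$(mktemp)
printf 'a\tx\nb\ty\na\tz\n' > "$sample"
result=$(top_values "$sample" 1)
echo "$result"
rm -f "$sample"
```

Saving the function in a script means the pipeline only has to be typed once, and the file name and line count become arguments.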
Much of *nix programming seems to be about stringing together a series of existing utilities, with each utility serving a very specific purpose, rather than writing large utilities that can solve a wide variety of problems.
- iamfromthenorth, on 10/12/2007, -1/+2While it's not official Linux distro code, there's also:
http://www.antair.com/andrey/code/col2row.html
- citrusfizz, on 10/12/2007, -6/+7that should be one not on ;-P
- diecastbeatdown, on 10/12/2007, -0/+1or: grep -i the memo
if you are looking for a specific character regardless of case, in this case 't or T', then you would not really need to specify it in brackets. now if you did not care about the character preceding 'he', as in 'she', 'the' and 'they', you would then do this:
egrep '^.he' memo
but for only three letter words do this:
egrep '^.he$' memo
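A small sketch for checking these patterns, and the bracket expressions from earlier in the thread, against a made-up file with one word per line. The file contents are invented for illustration, and `grep -E` is used as the modern spelling of `egrep`:

```shell
# Hypothetical sample file: one word per line.
memo=$(mktemp)
printf 'the\nThe\nshe\nthey\n,he\nhello\n' > "$memo"

# [Tt] matches exactly one character: T or t.
tt=$(grep -c '[Tt]he' "$memo")
# [T, t] also accepts a comma or a space in that position.
comma=$(grep -c '[T, t]he' "$memo")
# ^.he  : any single character followed by "he" at the start of a line.
start=$(grep -Ec '^.he' "$memo")
# ^.he$ : the whole line is exactly three characters ending in "he".
whole=$(grep -Ec '^.he$' "$memo")

echo "$tt $comma $start $whole"
rm -f "$memo"
```

The counts differ because `,he` only matches the comma-and-space bracket version, and `they` is excluded once the `$` anchor is added.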
i should write my own article and get published on one of these sites. bah! ;P
of course the above assumes a few things which i won't get into here. maybe i'll write that article after all.
- diecastbeatdown, on 10/12/2007, -2/+3'you know those' i highlighted and accidentally deleted some letters. ;)
oh well, that's what I get for being so critical of another person's comment.
- jassim, on 10/12/2007, -0/+1cool but need more like "xargs" and "sed" and "sort" very important softwares
- soapboy, on 10/12/2007, -0/+1Take an intro Linux shell scripting class at any community college or university, and you will get more training on this than you probably could ever want.
- dineshbabu, on 10/12/2007, -0/+1jo42 , I couldn't agree with you more! There is no surprise in Linux having all these tools, it is *nix based and if you knew *nix tools, you know this already!
- dineshbabu, on 10/12/2007, -0/+1Is it a surprise that Linux has all these utilities ?
- mapkinase, on 10/12/2007, -2/+2Do not listen. Inline PERL scripting is easy, powerful and comprehensive, and you do not have to learn other routines.
Nicely, grep has a -P option that allows it to behave like perl -ne 'print if /$expr/'.
- nanoage, on 10/12/2007, -1/+1Yes, and can you ask those kids why there is a More command for me. Less kinda makes More obsolete.
- sik0fewl, on 10/12/2007, -1/+1alias more=less # more is less (more or less)
- schappsj, on 10/12/2007, -2/+1*agree with Whitey04* This is one of the most lame posts I've seen on digg. How can you use Linux and not know about grep, pipe, etc?!
- diecastbeatdown, on 10/12/2007, -4/+2@midlifecrisis
you knose english quizes in preschool where they say things like 'apples are much better when eaten with milk' and they ask 'is this a fact or an opinion?'
you guessed it.
- midlifecrisis, on 10/12/2007, -4/+1@diesoondeadbeat
"knose"? How long since you left preschool? Three days?
- frogpelt, on 10/12/2007, -5/+2Was that supposed to be a play on "Lions and Tigers and Bears..." ?
Ok, then.
- vetipc, on 10/12/2007, -5/+1Come on, windows is so much easier... click click ;)
- 3monkeys, on 10/12/2007, -10/+4That should be "Linux" not "Liux". My bad.
- Whitey04, on 10/12/2007, -11/+3Marked as lame.
This isn't really news or anything new. If you would like to find out about things like this, anybody who can find digg can find google.
Cheers.

