47 Comments
- tracyde, on 10/12/2007, -0/+18 You should really try the `info` and the infamous `man` commands; they will tell you all you need to know.
- Chrysalid, on 10/12/2007, -2/+16 wget is useful. It's a pity the recursive option gets you banned on some servers :P
- billspaced, on 10/12/2007, -0/+12 Yes, you are.
- jvimal, on 10/12/2007, -0/+11 But you have the option of doing a recursive download (mirroring) with your Host HTTP header changed, and setting a wait between successive downloads of, say, 30 seconds.
- scullder, on 10/12/2007, -0/+11 Yes, and it allows you to not use all the available bandwidth of a server.
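A rough sketch of the slowed-down mirroring these comments describe, assuming GNU wget. `--mirror`, `--wait`, `--random-wait` and `--limit-rate` are real wget options; the `polite_mirror` name, the `DRY_RUN` switch and the 200k rate cap are made up for illustration.

```shell
#!/bin/bash
# Sketch: mirror a site slowly so the server is not hammered.
# polite_mirror, DRY_RUN and the 200k cap are illustrative assumptions.
polite_mirror() {
    # Replace the argument list with the full wget command; "$1" is the URL.
    set -- wget --mirror --wait=30 --random-wait --limit-rate=200k "$1"
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print the command; nothing is downloaded.
        echo "$*"
    else
        "$@"
    fi
}
```

Setting `DRY_RUN=1` before calling `polite_mirror http://www.foo.tld/` prints the command instead of running it, which is handy for checking the flags first.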
- Mejogid, on 10/12/2007, -3/+12 Wow - spam without even a pretense of making a comment. At least it's honest...
- bigkm, on 10/12/2007, -3/+9 I like curl
- JoeSlingo, on 10/12/2007, -0/+6 Some other handy wget flags.
resume wget'ing a partial download (tfa mentions this one)
---------------------------------------------------------------
wget -c http://www.foo.tld/foo.bin
keep trying indefinitely if the connection is bad
---------------------------------------------------------------
wget -c -t0 http://www.foo.tld/foo.bin
wget using ftp login and password
---------------------------------------------------------------
wget ftp://login:password@www.foo.tld/pub/foo.bin
*Note that your username and password will be visible in the process listing for the duration of the download which could suck if you are on a multi-user system.
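One way around the visible-password problem is a netrc file, which wget consults when the URL carries no credentials. This is only a sketch: `add_netrc_entry` is a hypothetical helper, and the host, login and password shown in the usage note are placeholders.

```shell
#!/bin/bash
# Sketch: keep FTP credentials in a netrc file instead of on the command line.
# add_netrc_entry is a hypothetical helper; wget itself looks in ~/.netrc.
add_netrc_entry() {
    netrc_file=$1
    host=$2
    login=$3
    password=$4
    touch "$netrc_file"
    chmod 600 "$netrc_file"   # readable and writable by the owner only
    printf 'machine %s login %s password %s\n' "$host" "$login" "$password" >> "$netrc_file"
}
```

After `add_netrc_entry ~/.netrc www.foo.tld login password` (or after editing the file by hand, which also keeps the password out of your shell history), a plain `wget ftp://www.foo.tld/pub/foo.bin` no longer shows the credentials in the process listing.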
don't overwrite files I've already downloaded (when recursive)
--------------------------------------------------------
wget -r -nc http://www.foo.tld/
save the downloaded file to bigtits.jpg instead of the original name
--------------------------------------------------------
wget http://www.foo.tld/*****.jpg -O bigtits.jpg
- simd, on 10/12/2007, -1/+6 I use wget to mirror daily to a backup server using FTP on the primary server. As it's incremental, it's not too hard on bandwidth.
It's available for both Windows and Linux and is proving to be very reliable.
- gazzerh, on 10/12/2007, -0/+4 Installation is quite simple on OS X 10.4. Download and extract: http://www.statusq.org/images/wget.zip
su to the root user (su -). You may need to change the root password if you haven't already. Or just sudo these commands (it's a bit annoying that the /usr/local dir tree isn't in OS X by default!!!)
mkdir -p /usr/local/bin
mkdir -p /usr/local/etc
mkdir -p /usr/local/man/man1
Copy the files into the locations as per the README.txt file.
Edit /etc/profile and modify so the PATH variable reads something like this:
PATH="/usr/local/bin:/bin:/sbin:/usr/bin:/usr/sbin" - note the inclusion of /usr/local/bin:
Edit /usr/share/misc/man.conf and add another MANPATH:
MANPATH /usr/local/man
You will need to open another terminal for the PATH environment variable to take effect, or just type:
export PATH=$PATH:/usr/local/bin
Should be all done. Type wget and see if you get a response. Type man wget and make sure the man page works.
- computerdude33, on 10/12/2007, -1/+5 There's also a Mac graphical version, CocoaWget. It's nice, although I'll have to agree with you on the medium-sized downloads.
- szelij, on 10/12/2007, -1/+5 Nice, I've sheepishly been wondering what the hell Wgets were all this while...
- BrewmasterC, on 10/12/2007, -0/+3 The --random-wait option is great for avoiding automated download detectors.
- 0v3rk1ll, on 10/12/2007, -1/+4 A nice front-end for WGET is GWGET.
http://gnome.org/projects/gwget/
- Sotired, on 10/12/2007, -0/+3 TVarmy - here's a hint
rather than attracting attention on the TAL site, which I almost did a while back when I wanted to do the same thing. (They want to sell CDs of TAL rather than offer downloads.)
You could, of course, logically argue that streaming is "downloading", but they feel they are obfuscating the possibility of "downloads" by, as you say, hiding the real URL inside the m3u file. A "better" way to go is to find the schedule of the broadcast on your local NPR station, CAPTURE the stream as an mp3 file, and set it up as a cron job to run once a week at the specified times.
for example
..........................................................
#!/bin/bash
#
#stream cap for WXXX public radio specifically BBC- then conv to .mp3 file
#sets directory where I want to save the stream
cd /data/music/bbc/
#actually goes out to get the stream- get the URL the same way you got it for TAL.M3U
#I don't need to explain foo.foo.foo, right ?
wget http://foo.foo.foo.net/stream/foo_wXXX &
#allows wget to capture the stream for 61 minutes to allow overlap of the hour cap.
sleep 3660
#stops the wget process
killall wget
#sets a date specific timestamp - which allows you to add it to the end of the file
set $(date)
#copies the no extension file (the capture) to an MP3 file with a date and time stamp on it
mv foo_wXXX bbcwsvce_1hr_$6-$2-$3-$4.mp3
#finishes the script
exit
..................................
#my remarks in there are for you to understand the script
modify it however you wish, i.e. the sleep time, the saved file name, the storage directory, etc.
name that file whatever-u-want.sh
save it in your /home/USER/bin directory
then set a cron job to run once a week at the time you wish it to start - bingo
it's like TiVo for the radio - and automated
have fun - and again, I would advise staying away from the TAL site as far as big steady downloads go. They don't like it: copyright, royalties to performers, etc.
good luck, have fun, lay low.
- cecil_t, on 10/12/2007, -0/+3 curl has more connectivity capabilities than wget (proxies, SSL, cookies, HTTP posts, etc.) but wget can do more with the content itself when downloading web pages (recursive downloads, link following to a specified level, etc.), so I wouldn't call wget a "stripped down version of curl".
- piratemonkey, on 10/12/2007, -0/+2 Before there was wget there was lynx -source -dump. Now THAT's old school..
- Mejogid, on 10/12/2007, -2/+4 I also find gwget useful - it's a graphical version that integrates with Firefox thanks to an extension. Although I still use wget for larger tasks (thanks to the greater number of options available), it's very handy for medium-sized downloads, especially if you're on an internet connection that keeps dropping, as I am.
- Markie1006, on 10/12/2007, -0/+2 wget is an awesome tool.
Unfortunately it's also used by a lot of exploits to get nefarious code onto your box before running it (typically Perl-based IRC servers).
I always rename mine to something less obvious, e.g. WGET or _wget.
- Markie1006, on 10/12/2007, -0/+2 Using killall seems like a bit of a blunt stick if you ask me.
It would be much cleaner, and safer to use the $! variable (that is what it is designed for).
kill $!
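Applied to the capture script earlier in the thread, that could look roughly like the sketch below. `run_for` is a hypothetical helper name, not anything from the original script; it starts a command in the background, remembers `$!`, and later stops only that process.

```shell
#!/bin/bash
# Sketch: run a command in the background for a fixed time, then stop only it.
# run_for is a hypothetical helper name.
run_for() {
    seconds=$1
    shift
    "$@" &                              # start the capture in the background
    pid=$!                              # PID of that background job only
    sleep "$seconds"
    kill "$pid" 2>/dev/null || true     # stop just this process, not every wget
    wait "$pid" 2>/dev/null || true     # reap it before carrying on
    return 0
}
```

For the radio capture this would be called as `run_for 3660 wget http://foo.foo.foo.net/stream/foo_wXXX`, so a second capture running at the same time is left alone.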
From TFM
"Expands to the process ID of the most recently executed background (asynchronous) command."
Works in bash and ksh
- georgemoore13, on 10/12/2007, -0/+2 For those of you who are lazy
http://www.geocities.co.jp/SiliconValley/8916/Macintosh/CocoaWget_en.html
http://www.gnome.org/projects/gwget/
- ajcannon, on 10/12/2007, -3/+5 Wow, this is perfect timing for me. I've been needing to learn how to use wget for my job and didn't know where to start... thanks so much!
-Andrew
- gazzerh, on 10/12/2007, -0/+1 ticktock4: "Wget is not old school to me. Old school 'internet downloading', for me, was assembling N individual uuencoded usenet posts, decoding, and downloading the GIF files at 9600kbps to see porn in 1991."
lol. me too :)
- TVarmy, on 10/12/2007, -0/+1 I want to automate the download of the This American Life radio show from their website. Each week, they put up a .m3u stream, which is really a link to an MP3 of the show, as can be seen when opening it as a text file. Is there a shell script I could write so that each week it finds the newest files on the server that I don't have yet, finds the MP3 link inside them, and then tells wget to download them?
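As a hint for the "find the MP3 link inside" part, here is a minimal sketch. It assumes the playlist is a plain text file with one URL per line and optional `#` comment lines; `m3u_urls` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the media URLs found in an .m3u playlist.
# m3u_urls is a hypothetical helper; it assumes one URL per line.
m3u_urls() {
    # Strip carriage returns (playlists are often saved with DOS line
    # endings), then keep only lines that start with http:// or https://.
    tr -d '\r' < "$1" | grep -E '^https?://'
}
```

Each URL it prints can then be handed to `wget -nc`, which skips files that are already in the current directory.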
You don't have to write it for me if you don't want to. A few hints to help a Linux novice like me would be great.
- cobweb, on 10/12/2007, -0/+1 You call that simple? I use Fink. Now thaaaat's simple. :)
http://fink.sourceforge.net/
$ fink install wget
- macewan, on 10/12/2007, -0/+1 worked like a charm, thanks - I missed using wget now that I'm trying OSX
- burke, on 10/12/2007, -0/+1 I've said it before and I'll say it again: evolve yourself out of my genepool.
- pixelbeat_, on 10/12/2007, -0/+1 I've a section of wget recipes in my command line cheatsheet:
http://www.pixelbeat.org/cmdline.html
- duhblow7, on 10/12/2007, -0/+0 wget -prk = one of my personal favs
- KWhat, on 10/12/2007, -1/+1 agreed, wget is a stripped down version of curl. Although wget is much simpler if you just need to download files.
- rhizome, on 10/12/2007, -1/+1 it's a pity that there are lamers with no reading skills who use wget recursively in full-suck mode rather than using the options to slow it down.
- Sotired, on 10/12/2007, -0/+0 a couple of things I forgot
when you set the date, it is captured at that very second
so if you want to record another hour, be sure to run the set routine again
right before naming the second file; otherwise you will be naming the second file exactly the same as the first and it will be overwritten.
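A sketch of one way to get a fresh stamp every time a file is named: build the name with `date +FORMAT` at the moment of the rename. `stamped_name` is a hypothetical helper, and the year-month-day layout is just one possible choice.

```shell
#!/bin/bash
# Sketch: build a date-stamped file name at the moment it is called.
# stamped_name is a hypothetical helper; the layout is an arbitrary choice.
stamped_name() {
    # date +FORMAT does not depend on the locale's default date layout,
    # unlike splitting the output of plain `date` into $1..$6.
    echo "${1}_$(date +%Y-%m-%d_%H%M%S).mp3"
}
```

In the capture script the rename would then read `mv foo_wXXX "$(stamped_name bbcwsvce_1hr)"`, and a second capture renamed later picks up its own timestamp.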
be careful about your scheduling if another instance of the cap runs concurrently:
the killall wget will stop all instances of wget, so... just think a little bit about it.
also... if you have an aborted session for some reason, make sure you delete the no-extension file (the incomplete capture), because if it is still in the directory when the script runs again, wget will append .1 to the name of the new file and the new file won't be renamed. You'll have the data but it will be messy to deal with. Took me a couple of tries to figure that out. It will all make sense once you run it a few times.
:)
- ticktock4, on 10/12/2007, -0/+0 I always get a chuckle out of people calling things "old school".
Old school is and always will be relative. Wu-Tang is not old school hip-hop to me, Kurtis Blow is.
Wget is not old school to me. Old school 'internet downloading', for me, was assembling N individual uuencoded usenet posts, decoding, and downloading the GIF files at 9600kbps to see porn in 1991.
There's always someone with a faster car, and there's always someone older school.
Give up :)
- LKBM, on 10/12/2007, -2/+2 I used to use wget in a lot of my Perl scripts. It was easier than using Perl's sockets, and it was a quick and dirty method.
Then, one day, the website changed the log-in slightly (simply changing the cookie domain to .domain.com instead of domain.com, I think), and suddenly wget didn't work. So with some help from my brother, we got this:
system(qq(wget -d 'http://domain.com/login.php' --post-data='username=$username&password=$password' -O /dev/null 2>&1 | grep -o '^Set-Cookie: [^=]+=[^;]+' | head -n 3 | tr '\n' ';' | sed -e 's/^Set-//;s/Set-Cookie://g;s/;$//' > $basedir/cookie/$lun));
I probably should have just gone to Perl's sockets when they changed. I eventually did anyway.
- dougmc, on 10/12/2007, -0/+0 wget isn't old-school. lynx isn't old school.
Even mosaic isn't old school.
( echo "GET /foo.txt HTTP/1.0" ; echo "" ) | telnet www.foo.com 80 > foo.txt
That's closer to old school, but not truly old school. Old school didn't have this new-fangled http. gopher maybe, ftp ... that's beginning to get into old school :)
uucp ... THAT is old school :)
- Grinler, on 10/12/2007, -0/+0 LOL..me too. That is until newsbin came out
- catmistake, on 10/12/2007, -0/+0 curl, anyone?
- Sotired, on 10/12/2007, -0/+0 @markie1006
you are absolutely right, thanks for the heads up on that one.
- pothananunkoi, on 12/14/2008, -0/+0 why should I get a hackable tool? :D
- rhizome, on 10/12/2007, -2/+1 OMGLOL THAT WUD BE AWSUM
- CharlesDarwin, on 10/12/2007, -2/+1 lol@lifehacker! what fscking noobs
- swills, on 10/12/2007, -1/+0 fetch is a similar utility that comes built into Mac OS X, inherited from FreeBSD. It always bugs me when people install a third party utility for something when there's a similar one already installed...
- gommle, on 10/12/2007, -3/+1 I don't like it. I can do the same better with DownThemAll for Firefox.
If it had a GUI for ALL the advanced options I would use it all the time.
(ima make a 4chan /e/ autoDLer.)
- inactive, on 10/12/2007, -6/+1 yep curl is far superior.
- gamemaster357, on 10/12/2007, -7/+1 http://digg.com/gaming_news/Download_Lego_Star_Wars_II_demo
- caddoo, on 10/12/2007, -13/+7 Am I the only one who read that title as 'masturbating midget' first!
- inactive, on 10/12/2007, -8/+1 wget can't handle multi-parted/threaded downloads.

