Documenting Problems That Were Difficult To Find The Answer To

Using wget to Automate Logging Into Websites

The open-source wget tool is useful for automating website access and scraping, in particular because it can save cookies to a file and load them back on later requests.

# create a name for the cookie jar/file

# save cookies from homepage access
wget --spider --save-cookies "$COOKIE_JAR" --keep-session-cookies

# now submit request using saved cookies
wget -O - \
  --load-cookies "$COOKIE_JAR" \
  --save-cookies "$COOKIE_JAR" \
  --keep-session-cookies \
  --header "Referer:" \
  --post-data='startlat=1.357348601&startlng=103.9884093&endlat=1.276243657&endlng=103.8545958&routeopt=fastest&start_type=mrt&end_type=mrt&mode=TRANSIT&use_lrt=yes' \
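
The two steps above could be wrapped into small shell functions, roughly as sketched below. This is only an illustration: the function names, the HOME_URL and LOGIN_URL variables, and the use of mktemp for the cookie jar are assumptions of mine, not part of the commands above. Only wget options already used in this post appear in it.

```shell
#!/bin/sh
# Sketch: log in with wget using a cookie jar.
# save_session_cookies and post_with_cookies are hypothetical helper names.

# Step 1: visit a page so the server can set its session cookie, and save it.
save_session_cookies() {
    # $1 = cookie jar path, $2 = URL to visit
    wget --spider \
        --save-cookies "$1" \
        --keep-session-cookies \
        "$2"
}

# Step 2: POST form data, sending the saved cookies and re-saving any updates.
post_with_cookies() {
    # $1 = cookie jar path, $2 = Referer URL, $3 = POST body, $4 = target URL
    wget -O - \
        --load-cookies "$1" \
        --save-cookies "$1" \
        --keep-session-cookies \
        --header "Referer: $2" \
        --post-data="$3" \
        "$4"
}

# Example use (HOME_URL and LOGIN_URL are placeholders you must set yourself):
#   COOKIE_JAR=$(mktemp)
#   save_session_cookies "$COOKIE_JAR" "$HOME_URL"
#   post_with_cookies "$COOKIE_JAR" "$HOME_URL" 'mode=TRANSIT&routeopt=fastest' "$LOGIN_URL"
```

Passing the cookie jar path as an argument, rather than relying on a global variable, keeps each function usable on its own and makes it harder to run a request with an unset jar by accident.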

Note that --spider performs a HEAD request and does not download the response. The -d (debug output) and -S (print server response headers) options are useful for debugging and seeing what is sent and received. For cookies, the --keep-session-cookies option is essential: without it, session cookies (those with no expiry time set) are not written to the cookie file.
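One way to check that session cookies really ended up in the cookie file is to look for them directly. The sketch below assumes the tab-separated Netscape cookies.txt layout that wget writes, and, as far as I understand wget's behaviour, that session cookies are stored with an expiry field of 0. The function name list_session_cookies is hypothetical.

```shell
# List the names of session cookies in a wget cookie jar.
# Assumed line layout (tab-separated):
#   domain, subdomain flag, path, secure flag, expiry, name, value
# Assumption: session cookies are written with an expiry of 0.
list_session_cookies() {
    # $1 = path to the cookie jar file
    awk -F '\t' '
        /^#/ || NF < 7 { next }        # skip comment, blank and short lines
        $5 == 0        { print $6 }    # expiry 0: treat as a session cookie
    ' "$1"
}
```

If this prints nothing after a request that should have set a session cookie, the likely causes are a missing --keep-session-cookies option or a server that did not send the cookie at all, which -S can help confirm.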
