Simple Web Scraper
https://sourceforge.net/projects/iad-dispatch-web-scraper/
Goal:
Write a very, very, very (did I mention very?) simple program to pull data from a simple website and plot the information on a graph. Additional work can be done to build a small database and do analytics on the data.
Introduction:
The Dulles International Airport (IAD) near Washington, D.C. has a taxi service provided by the Washington Flyer. Taxi cabs are leased by drivers and rides are regulated using a queue system. Drivers enter a corral near the Arrival gate and wait for dispatchers to announce passengers.
Motivation:
The program should attempt to answer these "ten questions". (Ten being a fluid number that can range from one to however many I want. Ten is whatever number I say it is.)
- What time of day has the shortest expected wait time?
- What is my expected wait time when I enter the corral?
- Are there shorter wait times on certain days?
- How many rides can I get between the hours of 11pm and 5am?
- How many hours should I expect to work to average five rides a day?
- ...
- Ten
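As a rough sketch of how the first question might be answered once data has been collected: group the recorded wait times by hour of day and pick the hour with the lowest average. The sample records, the wait time being in minutes, and the function name `mean_wait_by_hour` are all assumptions made for illustration, not real dispatch data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical sample records: (dispatch timestamp, wait time in minutes).
SAMPLE = [
    ("11/30/2015 14:00:52", 95),
    ("11/30/2015 14:40:10", 85),
    ("11/30/2015 23:15:00", 40),
    ("12/01/2015 23:45:30", 50),
]


def mean_wait_by_hour(records):
    """Group wait times by hour of day and return {hour: mean wait}."""
    by_hour = defaultdict(list)
    for stamp, wait_minutes in records:
        hour = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S").hour
        by_hour[hour].append(wait_minutes)
    return {hour: mean(waits) for hour, waits in by_hour.items()}


averages = mean_wait_by_hour(SAMPLE)
# The hour of day with the lowest average wait in the sample.
best_hour = min(averages, key=averages.get)
print(best_hour, averages[best_hour])
```

The same grouping keyed on the weekday instead of the hour would be one way to approach the "certain days" question.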
Steps:
- find out how to connect to a website and download html
- find out how to auto-refresh and get data periodically
- learn how to parse html for relevant data: departure times, numbers of cars
- generate plots
- create user interface to display info
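The first three steps could be sketched roughly as below. This is only a sketch under assumptions: the element ids (`lot-count`, `last-dispatch`) and the page layout are invented, since the real dispatch page's markup is not described here. It also uses the standard library's `html.parser` in place of BeautifulSoup so that it runs without extra installs; with BeautifulSoup the lookup would be a `soup.find(id=...)` call instead of the small parser class.

```python
import time
import urllib.request
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collect the text of elements whose id is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # id of the element being read, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id in self.wanted:
            self.current = element_id
            self.fields[element_id] = ""

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field.
        self.current = None


def parse_fields(html, wanted):
    """Return {id: text} for the wanted element ids found in the HTML."""
    parser = FieldParser(wanted)
    parser.feed(html)
    parser.close()
    return {key: value.strip() for key, value in parser.fields.items()}


def fetch_html(url, timeout=10):
    """Download a page and decode it to text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def poll(fetch, wanted, count, interval_seconds=60, sleep=time.sleep):
    """Call fetch() `count` times, pausing between calls, and parse each page."""
    samples = []
    for i in range(count):
        samples.append(parse_fields(fetch(), wanted))
        if i < count - 1:
            sleep(interval_seconds)
    return samples
```

Against a live page, usage might look like `poll(lambda: fetch_html(url), ["lot-count", "last-dispatch"], count=10)`, where `url` is the dispatch page address. Passing `fetch` and `sleep` in as arguments keeps the loop easy to try out on a saved copy of the page.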
Tools:
Python, BeautifulSoup library
To do:
- Plot the data points by date and time according to this format "11/30/2015 14:00:52"
- Figure out which plots best show the data. Candidates:
  - Holding Lot Count vs date and time of dispatch update
  - Wait time of last dispatched vs date and time of dispatch (entry time + wait time)
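A minimal sketch of preparing the plot data, assuming the timestamps follow the "11/30/2015 14:00:52" format (month/day/year, 24-hour clock) and that wait times are recorded in minutes. The sample rows and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

# Matches timestamps such as "11/30/2015 14:00:52".
STAMP_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_stamp(text):
    """Turn a timestamp string into a datetime object."""
    return datetime.strptime(text, STAMP_FORMAT)


def dispatch_time(entry_text, wait_minutes):
    """Dispatch time = entry time + wait time (wait assumed to be minutes)."""
    return parse_stamp(entry_text) + timedelta(minutes=wait_minutes)


# Hypothetical rows: (entry timestamp, wait in minutes, holding lot count).
rows = [
    ("11/30/2015 12:30:52", 90, 120),
    ("11/30/2015 13:10:00", 75, 112),
]

# x values shared by both plots: the date and time of dispatch.
dispatch_times = [dispatch_time(entry, wait) for entry, wait, _ in rows]
# y values for the two candidate plots.
lot_counts = [lot for _, _, lot in rows]
wait_times = [wait for _, wait, _ in rows]
```

Lists like these can be handed to a plotting library as x and y values; matplotlib, for example, accepts `datetime` objects on the x axis.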