Having recently purchased my first eReader, the Kobo Aura, I’ve really started to enjoy reading books more. Instead of having to carry large books about, the small device can sit in my pocket and read books on the go. With the backlight I can even read in bed without the bedside table lamp on.
The only downside is the price of eBooks. Compared to paperbacks at places such as The Works (especially when you take a photo of your receipt and send it to TopCashBack – saving £5 when you spend £20), eBooks can be quite pricey…
There are services out there which will crawl and monitor stores to tell you the prices as these change, such as Luzme, but where’s the fun in that..?
In this post, I’m going to explore crawling the Kobo Store using TaskCentre. Once crawled, any price changes will be updated back to a database and an email to myself sent about the changes.
If a XML Web Service was available, this would make life easy as this could be called directly, but in this case there is not one available, so we need to be creative and download the HTML, extracting the required text.
Firstly we need a database in able to store the data, plus a few books which we wish to keep an eye on.
Here’s the SQL Query to make the table:
CREATE TABLE [dbo].[KoboBooks]( [UniqueID] [int] IDENTITY(1,1) NOT NULL, [Title] [nvarchar](max) NULL, [URL] [nvarchar](max) NULL, [CurrentPrice] [decimal](18, 2) NULL, [PriceLastUpdated] [datetime] NULL, [PreviousPrice] [decimal](18, 2) NULL ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Additionally I’ve added a couple of books to the database which I’m interested in monitoring.
Downloading HTML and Processing
Using a standard OLEDB connection to extract the information from the SQL database, this is passed onto a VBScript step within Taskcentre. This VBScript step uses the WinHttp.WinHttpRequest.5.1 COM Object in order to download the HTML document. Afterwards standard VBScript String Functions are used to extract the required text and stored in a variable.
Additionally, a variable is set to highlight if the price has changed. This is to ensure that only the updated prices are updated back to the database.
In order to update the database, I could write a stored procedure to perform the logic required, but instead I’ve used a lesser used feature in Taskcentre – the memory within the VB Script tool. The memory can be used to call SQL queries and update the data in the database, without writing complex stored procedures. In the screenshot to the left, you can see the mapping, plus also I’ve used the “StoreData” variable to decide whether or not to run the SQL queries to store the data within the database. (See the VB Script in the previous screenshot to see the variable being set.)
Alerting on Price Changes
After storing the data back into the SQL database, this can be extracted via a simple SQL query, dropped into email and sent to myself. To the right is a copy of the data currently stored within the database.
The SQL query to return the data required is:
SELECT KoboBooks.UniqueID AS UniqueID, KoboBooks.Title AS Title, KoboBooks.URL AS URL, KoboBooks.CurrentPrice AS CurrentPrice, KoboBooks.PriceLastUpdated AS PriceLastUpdated, KoboBooks.PreviousPrice AS PreviousPrice FROM KoboBooks WHERE KoboBooks.PriceLastUpdated >= GetDate() - 1
Putting it all together, here’s a screenshot of the full task:
And a copy of the alert triggered:
As with everything, there’s always possible improvements. These could be:
- GUI for managing list of books
- Multiple Stores (WHSmith etc)
- Error handling (If website changes slightly, it will throw error messages)
- Using XPath to extract prices instead of simple string functions.
- Use SQL triggers to send details of price changes, possibly with a copy of the web page embedded within the email