Sep 2 2011

W7P SwimEventTimes development - Fiddler2

Category: Features - Administrator @ 11:19

Back to the beginning.  When you need content and that content exists on other web sites, you'll have to devise a way to acquire the data.  I'm not talking about OData or some other nicely structured data lying around on the internet.  What I'm talking about is, for lack of a better term, "SCRAPING" data from other URLs.  Some would call this Web Scraping, but scraping is really a whole family of techniques that applies beyond the web.
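To make the idea concrete, here is a minimal sketch of scraping in Python, using only the standard library.  The markup, the event names and the times are all made up for illustration; a real page will look different and can change without notice.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical markup, standing in for a page fetched from the target URL.
SAMPLE_HTML = """
<table>
  <tr><th>Event</th><th>Time</th></tr>
  <tr><td>50 Free</td><td>25.31</td></tr>
  <tr><td>100 Back</td><td>1:02.47</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(SAMPLE_HTML)
for row in scraper.rows:
    print(row)
```

Notice how much the parser depends on the page layout.  Rename or restructure the table and the scraper breaks, which is exactly the pitfall to plan for.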

Some believe that this type of development is fraught with problems, since you're tying your app to a URL and to that site's web design.  Know these pitfalls up front and mitigate as many of them as you can.

Try to contact the site's owners and get them on board with your development.  That way you'll at least have advance knowledge of upcoming changes that could break your W7P app.

Disclaimer:  I'm a developer and not a copyright lawyer, but be warned that some sites believe they own the data and hold the sole rights to it.  It's the old "Creative" vs "Collection" argument.  The argument goes that a collection of data is not creative, but do your own copyright research.

Now let's talk about overall design.  Some designs call for an intermediate server to make the calls to the target URL, but I opted to make the Windows 7 Phone software smart enough to stand on its own and pull the data directly from the target.  This minimizes the points of failure: it either works from the W7P or it doesn't.  It also lets the phone operate anonymously from the target URL's perspective.

The main tool that I quickly came to rely on is Fiddler2.  Fiddler2 lets you look at the payload when making calls across the web.  This includes everything that is sent in your Request and everything in the subsequent Response.
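If you copy a raw Response out of Fiddler2, it is plain text: a status line, a block of headers, a blank line, then the body.  The Python sketch below pulls those pieces apart.  The sample response is invented for illustration (the cookie value and the HTML are assumptions, not captured traffic), and the helper name `split_http_message` is my own.

```python
def split_http_message(raw):
    """Split a raw HTTP message into (start_line, headers, body).

    Headers are returned as a list of (name, value) pairs rather than a
    dict, because a header such as Set-Cookie may appear more than once.
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


# Invented sample, shaped like the raw view of a captured Response.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Set-Cookie: ASP.NET_SessionId=abc123; path=/; HttpOnly\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)

start_line, headers, body = split_http_message(RAW_RESPONSE)
print(start_line)
for name, value in headers:
    print(name, "=>", value)
print(len(body), "characters of body")
```

Reading a capture this way shows exactly which headers your own code will have to send and which ones it will have to read back.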

The site from which the data for this app will be scraped is an ASP.NET site.  I can't be sure, but the site appears to use a Content Management System.  I believe the CMS is Open Source, so I could have dug into it and pulled apart the code, but I decided to spend the time on whether and how I could get access to the data across the web from the phone.

Other sites run on other software platforms, and how they implement cookies, session data, etc. will change how you access their pages, but either way you need to start with Fiddler2.
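As one illustration of why the platform matters, here is a hedged Python sketch.  It ASSUMES the target is an ASP.NET Web Forms site, where pages typically carry hidden form fields such as `__VIEWSTATE` that must be posted back, and where the session is tracked with a cookie.  The form, the field values, the `swimmer` input and the helper names are all hypothetical; whatever your own Fiddler2 captures show is what counts.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def cookie_pair(set_cookie_value):
    """Return the 'name=value' part of a Set-Cookie header value,
    which is what goes back to the server in the Cookie request header."""
    return set_cookie_value.split(";", 1)[0].strip()


# Hypothetical page, assuming Web Forms style hidden fields.
PAGE = """
<form method="post" action="results.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
  <input type="hidden" name="__EVENTVALIDATION" value="AbC+/=" />
  <input type="text" name="swimmer" />
</form>
"""

parser = HiddenFieldParser()
parser.feed(PAGE)

# Echo the hidden fields back, plus the one field a user would fill in.
form = dict(parser.fields)
form["swimmer"] = "Smith"
post_body = urlencode(form)

# Hypothetical Set-Cookie value, as it might appear in a captured Response.
cookie_header = cookie_pair("ASP.NET_SessionId=abc123; path=/; HttpOnly")

print(post_body)
print("Cookie:", cookie_header)
```

A site built on a different platform may need none of this, or something else entirely, which is why the captures come first.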

I know IE9 and Firefox have similar developer tools, but when this application got started, Fiddler2 was my choice.

Get to know Fiddler2 and exactly which pages you need to hit.  Take screenshots of the page flow you'll need to capture, and write test scenarios that cover that flow.

You and Fiddler2 will spend many long nights together.
