Dec 11 2011

Swim Event Times W7P Development -- BeginGetRequest

Category: Administrator @ 07:22

The heart of this application is its ability to scrape a URL.  To do that we make asynchronous calls (BeginGetRequestStream, then BeginGetResponse) out to a URL in the hope that we eventually get a page's worth of data back.  We also need to bound each call with a timeout.

So let's talk about the functional setup and what the code has to handle.

To complicate things, the website that I'm hitting is an ASP.NET application which does not use cookies stored on the client.  It uses session state to track your position within the list of pages that you have requested. The site also does not use the query string in any form on the URL, which makes things a bit stickier since you cannot hit a desired page straight away: you always have to start with the Search page and pump through the rest of the pages.  All of these little things added up to an annoying set of problems.
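Why the pages have to be walked in order can be sketched with a CookieContainer. This is a minimal illustration, not the app's code: it assumes the site tracks its session with ASP.NET's default non-persistent session cookie (ASP.NET_SessionId), and the class name, URLs and session id are made up.

```csharp
using System;
using System.Net;

public static class SessionCookieDemo
{
    // Returns the Cookie header that a request to nextPage would carry
    // after an earlier response from firstPage set a session cookie.
    public static string HeaderForNextPage(Uri firstPage, Uri nextPage, string sessionId)
    {
        var jar = new CookieContainer();

        // Stand-in for what the server's Set-Cookie header would do on the
        // first response.  The cookie name is ASP.NET's default (an assumption
        // about this site); the path "/" makes it apply to every page on the host.
        jar.Add(firstPage, new Cookie("ASP.NET_SessionId", sessionId, "/"));

        // Any later request that shares the same container sends it back.
        return jar.GetCookieHeader(nextPage);
    }
}
```

Calling HeaderForNextPage with two pages on the same host should return a header carrying the same session id, which is why every request in the listings below reuses one cookie container instead of creating a fresh one.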

Purely from a user-experience standpoint, the site design is poor, since it makes you re-enter the same information every time you visit the site (no cookies).  Maybe this is by design, but it does not make for a good customer experience. There's also no mobile support, which means that using your phone to hit the site is a real boondoggle.

Now for the code.

 

To kick off any request to a URL you'll need to do it on a separate thread:

public void SendPost()
        {           
            // Create a background thread to run the web request
            Thread t = new Thread(new ThreadStart(SendPostThreadFunc));
            t.Name = "URLRequest_For_" + "TODO";
            t.IsBackground = true;
            t.Start();
        }

 

Next we need to keep the primed Request Stream since we need it on subsequent calls to the site.  So in this case we use BeginGetRequestStream:

void SendPostThreadFunc()
        {

            //test the network first
            if (online == false)
            {
                this.Dispatcher.BeginInvoke(() =>
                {
                    progressBar1.IsLoading = false;
                    MessageBox.Show("Network Disconnected.  Please try again when you have a good Network signal.");
                });
                return;
            }


            // Queue the request setup on the ThreadPool.  The try/catch lives inside
            // the work item: a handler wrapped around QueueUserWorkItem would never
            // see an exception thrown later on the pool thread.
            ThreadPool.QueueUserWorkItem(state =>
            {
                try
                {
                    // Create the web request object
                    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(CookieColUri);
                    webRequest.Method = "POST";
                    webRequest.ContentType = "application/x-www-form-urlencoded";
                    webRequest.CookieContainer = cookieJar;

                    // RequestState is a custom class to pass info
                    RequestState reqstate = new RequestState();
                    reqstate.Request = webRequest;
                    reqstate.Data = "passed data";

                    webRequest.BeginGetRequestStream(GetReqeustStreamCallback, reqstate);
                }
                catch (Exception ex)
                {
                    //Debug.WriteLine(" --> BGRS3 Exception: " + ex.Message + ", Thread: " + Thread.CurrentThread.ManagedThreadId);

                    // notify your app of a problem here
                    this.Dispatcher.BeginInvoke(() =>
                    {
                        progressBar1.IsLoading = false;
                        MessageBox.Show("BGRS3: " + ex.Message);
                    });
                }
            });


        }
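The RequestState class is not shown in this post. A minimal sketch of what it might look like, inferred only from the three members the listings touch (Request, Data and AllDone); the real class may well carry more:

```csharp
using System.Net;
using System.Threading;

// Hypothetical shape of the RequestState helper, inferred from how the
// callbacks use it; the original class is not shown in the post.
public class RequestState
{
    // The in-flight request, so the callbacks can call End* on it.
    public WebRequest Request { get; set; }

    // Free-form payload passed along with the request.
    public string Data { get; set; }

    // Signalled by the response callback; the waiting thread uses it
    // to detect a timeout.
    public AutoResetEvent AllDone { get; set; }
}
```

Typing Request as WebRequest matches the casts to HttpWebRequest that the callbacks perform when they read it back.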

 Now we handle the callback from the BeginGetRequestStream (lots of error handling):

 

void GetReqeustStreamCallback(IAsyncResult asynchronousResult)
        {

            if (!asynchronousResult.IsCompleted)
                return;

            RequestState reqstate = null;

            try
            {
                // grab the custom state object
                reqstate = (RequestState)asynchronousResult.AsyncState;

                //Thread.Sleep(15000);  // uncomment this line to test the timeout condition

                HttpWebRequest webRequest = (HttpWebRequest)reqstate.Request;                

                // End the stream request operation
                Stream postStream = webRequest.EndGetRequestStream(asynchronousResult);

                // Create the post data
                string postData = "";
                for (int i = 0; i < paramNames.Count; i++)
                {

                    if (paramNames[i] == "POSTDATA")
                    {
                        postData = paramValues[i];
                        break;

                    }
                    else
                    {
                        // Parameter separator
                        if (i > 0)
                        {
                            postData += "&";
                        }

                        // Parameter data
                        postData += paramNames[i] + "=" + paramValues[i];
                    }
                }
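                byte[] byteArray = Encoding.UTF8.GetBytes(postData);

                // Add the post data to the web request.  Use the byte count, not the
                // character count: they differ once the data contains non-ASCII text.
                postStream.Write(byteArray, 0, byteArray.Length);
                postStream.Close();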


                ThreadPool.QueueUserWorkItem(new WaitCallback(target =>
              {
                  try
                  { // you must have this try-catch here to handle exceptions from the callback                      

                      // RequestState is a custom class to pass info
                      RequestState reqstate2 = new RequestState();
                      reqstate2.Request = webRequest;
                      reqstate2.Data = "passed data";
                      reqstate2.AllDone = new AutoResetEvent(false);
                      
                      webRequest.BeginGetResponse(new AsyncCallback(GetResponseCallback), reqstate2);

                      // Block this pool thread until the response callback signals,
                      // or abort the request if the timeout elapses first.
                      if (!reqstate2.AllDone.WaitOne(DefaultTimeout))
                      {
                          webRequest.Abort();
                      }
                      
                  }
                  catch (WebException webExcp)
                  {                   

                      WebExceptionStatus status = webExcp.Status;
                      if (status == WebExceptionStatus.ProtocolError)
                      {
                          // Get HttpWebResponse so that you can check the HTTP status code.
                          HttpWebResponse httpResponse = (HttpWebResponse)webExcp.Response;                   

                          this.Dispatcher.BeginInvoke(() =>
                              {
                                  progressBar1.IsLoading = false;
                                  MessageBox.Show("Unable to reach site. Please try later! " + (int)httpResponse.StatusCode + " - "
                                 + httpResponse.StatusCode + ".");
                              });
                      }
                  }

                  catch (Exception ex)
                  { // you must handle the exception or it will be unhandled and crash your app
                      
                      // notify your app of a problem here
                      this.Dispatcher.BeginInvoke(() =>
                      {
                          progressBar1.IsLoading = false;
                          MessageBox.Show("BGR1: " + ex.Message);
                      });
                  }


              }
                                  ));
            }


            catch (WebException webExcp)
            {               
                WebExceptionStatus status = webExcp.Status;
                if (status == WebExceptionStatus.ProtocolError)
                {
                    // Get HttpWebResponse so that you can check the HTTP status code.
                    HttpWebResponse httpResponse = (HttpWebResponse)webExcp.Response;                    

                    this.Dispatcher.BeginInvoke(() =>
                        {
                            progressBar1.IsLoading = false;
                            MessageBox.Show("Unable to reach site. " + (int)httpResponse.StatusCode + " - "
                           + httpResponse.StatusCode + ".");
                        });
                }

            }
            catch (Exception ex)
            {
                // notify your app of a problem here
                this.Dispatcher.BeginInvoke(() =>
                    {
                        progressBar1.IsLoading = false;
                        MessageBox.Show("BGR3: " + ex.Message);
                    });             
            }            

        }
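One thing the loop above does not do is escape the parameter names and values. ASP.NET pages typically post back hidden fields such as __VIEWSTATE whose base64 values contain +, / and =, so unescaped data can be misread by the server. A minimal sketch of an escaping version, assuming the same parallel name/value lists; FormEncoder is a hypothetical helper, not part of the app:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormEncoder
{
    // Builds an application/x-www-form-urlencoded body from parallel
    // name/value lists, percent-escaping each name and value.
    public static string Encode(IList<string> names, IList<string> values)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < names.Count; i++)
        {
            // Parameter separator
            if (i > 0) sb.Append('&');

            sb.Append(Uri.EscapeDataString(names[i]));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(values[i]));
        }
        return sb.ToString();
    }
}
```

For example, the names "name" and "q" with the values "A&B" and "x y" encode to name=A%26B&q=x%20y. Note that Uri.EscapeDataString writes a space as %20 rather than +.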

 Now get the response from the website:

void GetResponseCallback(IAsyncResult asynchronousResult)
        {

            if (!asynchronousResult.IsCompleted)
                return;

            // grab the custom state object
            RequestState reqstate = (RequestState)asynchronousResult.AsyncState;

            //Thread.Sleep(50000);  // uncomment this line to test the timeout condition 50 seconds (timeout 45)

            try
            {
                HttpWebRequest webRequest = (HttpWebRequest)reqstate.Request;                

                // End the get response operation
                HttpWebResponse response = (HttpWebResponse)webRequest.EndGetResponse(asynchronousResult);               

                Stream streamResponse = response.GetResponseStream();
                StreamReader streamReader = new StreamReader(streamResponse);
                Response = streamReader.ReadToEnd();
                streamResponse.Close();
                streamReader.Close();
                response.Close();

                // Call the response callback
                if (callback != null)
                {
                    callback();
                }

            }
            catch (WebException webExcp)
            {
                // If you reach this point, an exception has been caught.           
                WebExceptionStatus status = webExcp.Status;
                if (status == WebExceptionStatus.ProtocolError)
                {
                    // Get HttpWebResponse so that you can check the HTTP status code.
                    HttpWebResponse httpResponse = (HttpWebResponse)webExcp.Response;                 

                    this.Dispatcher.BeginInvoke(() =>
                        {
                            progressBar1.IsLoading = false;
                            MessageBox.Show("Unable to reach site. " + (int)httpResponse.StatusCode + " - "
                           + httpResponse.StatusCode + ". Launching browser directly at site to show error!");

                            // Launch the browser at the page that gave the error.  Doing it
                            // here keeps it on the UI thread and after the message box
                            // has been dismissed.
                            WebBrowserTask webBrowserTask = new WebBrowserTask();
                            webBrowserTask.URL = CookieColUri.ToString();
                            webBrowserTask.Show();
                        });

                    return;
                }
                else
                {
                    if (status == WebExceptionStatus.RequestCanceled)
                    { // Abort() was called after the timeout elapsed
                        this.Dispatcher.BeginInvoke(() =>
                            {
                                progressBar1.IsLoading = false;
                                MessageBox.Show("Network Connection lost.  Please try when you have a good Network signal.");
                            });
                        return;
                    }
                    else
                    {
                        this.Dispatcher.BeginInvoke(() =>
                        {
                            progressBar1.IsLoading = false;
                            MessageBox.Show("Request lost.  Please try when you have a good Network signal.");
                        });
                        return;

                    }

                }
            }
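            catch (Exception)
            {
                this.Dispatcher.BeginInvoke(() =>
                {
                    progressBar1.IsLoading = false;
                    MessageBox.Show("Request for Swimmer lost.  Please try when you have a good Network signal.");
                });
            }
            finally
            {
                // Always signal the waiting thread, including on the error paths that
                // return early; otherwise it blocks for the full timeout.
                reqstate.AllDone.Set();
            }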

        } 

 

Now go ahead and use the HTML Agility Pack on the return results (in the callback) to strip any data you want from that page.

Note:  You must use the timeout on these calls; otherwise you'll have a zillion crashes in your app.
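The timeout pattern used above (start the asynchronous work, wait on an AutoResetEvent, abort if the wait expires) can be sketched without any networking. This is a simplified illustration rather than Dan Colasanti's code: TimeoutDemo, RunWithTimeout, work and onTimeout are hypothetical names, and onTimeout stands in for the call to webRequest.Abort().

```csharp
using System;
using System.Threading;

public static class TimeoutDemo
{
    // Starts work on the ThreadPool and waits up to timeoutMs for it to end.
    // Returns true if the work finished (or failed) before the timeout;
    // otherwise calls onTimeout and returns false.
    public static bool RunWithTimeout(Action work, int timeoutMs, Action onTimeout)
    {
        var allDone = new AutoResetEvent(false);

        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                work();
            }
            catch (Exception)
            {
                // A real app would report the failure to the UI here.
            }
            finally
            {
                // Signal on success and on failure, so the waiter never
                // sits out the full timeout after an error.
                allDone.Set();
            }
        });

        if (allDone.WaitOne(timeoutMs))
        {
            return true;
        }

        // Timed out: this is where the request would be aborted.
        onTimeout();
        return false;
    }
}
```

A call such as TimeoutDemo.RunWithTimeout(() => Thread.Sleep(3000), 50, () => { }) returns false after roughly 50 ms, while work that finishes quickly returns true. The event is deliberately not disposed, because the work item may still signal it after the timeout has fired.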

The timeout code was provided by Dan Colasanti.  www.improvisoft.com/blog (I owe him many beers!)

 
