Apr 11 2017

Switch to read JSON vs raw HTML

Category: C#Administrator @ 18:00

I've maintained a Windows Phone application since the early days of Windows Phone 7 and it has served well for over six years.  If you read my earlier posts you'll see that the application was based upon a targets' website raw html pages.  The basis of the application was to pull the data from out  (scrape) of the resulting web pages and give a better experience for phone users.  I did this since the website was a complete disaster when trying to use it from the desktop and a failure when trying it on any smaller form factor like a phone.

Well, as expected, the website has finally re-engineered the set of pages that were used to scrape data from used in the phone application.  So taking a look at the re-engineered data I see that the website has taken a path to a more modern, reactive framework.  Lucky for me it looks like they have gone the route of delivering the data in one JSON payload.

So for me, I'm going to redo my code to pull out the JSON data and blow it into my observable collection. 

I need to setup a new Windows 10 64-bit box and all the other bits to get back on track like Fiddler/VS2012/Source Repository and the Phone SDK.

I'll detail what things I find when re-doing to read the JSON data rather than the raw HTML pages.

Should be fun.

Tags:

Dec 27 2015

Song List from CustomsForge in Excel format

Category: Administrator @ 20:05

This post came about since I've been trying to make better use of a guitar and Rocksmith 2014.

CustomsForge is the place for songs outside the paid downloads path and I've wanted to be able to search and hold onto a list of songs that I might like or eventually play.

So another use for Fiddler and dealing with data from sites that have no API would be to pull your own list using GetWebRequest.

Having done this previously against other sites I thought it might prove helpful to others who want a local Excel or printed copy of the list as of the beginning of December 2015.

Here' the link:

http://1drv.ms/1OUC9Id

Enjoy!

 

 

 

 

 

 

 

Tags: , , ,

Nov 22 2015

Importing Excel/CSV to SQL Server -- OpenDataSource

Category: Features | SQL ServerAdministrator @ 07:49

 

While there are many approaches which come and go for importing Excel and CSV data files into SQL Server nothing is as easy as using OpenDataSource.  I know there are many who use SSIS and its brethren but some things when dealing Excel combined with SSIS, like nvarchar, costs time and diminishes its ease of use.  So let's have a quick go of importing data, which I do on a local version of SQL running on my workstation.

Here's what my sample data looks like in Excel:

EmailAddress FirstName LastName
scottg@contoso.com Scott Guthrie
steves@contoso.com Steve Sanderson
aarons@contoso.com Aaron Skonnard
fritzo@contoso.com Fritz Onion
johns@contoso.com John Smith
scotth@contoso.com Scott Hunter
madsk@contoso.com Mads Kristensen
howardd@contoso.com Howard Dierking
elijahm@contoso.com Elijah Manor
estebang@contoso.com Esteban Garcia
shawnw@contoso.com Shawn Wildermuth
peteb@contoso.com Pete Brown
reisenberg@contoso.com Rob Eisenberg
timh@contoso.com Tim Heuer

 

So open a query window in SSMS and fire off the following query:

SELECT distinct * --[First Name], [Last Name], [EmailAddress]
FROM OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\MSSQL\clients\INTLdist\Queries\LoadEandF\SampleEmailList.xls;Extended Properties=Excel 8.0')...[Sheet1$]
where [FirstName] is not null

You'll see I'm using the provider for Excel's (97-2003) older version that are still in xls binary format. Also note the location of the Data Source which is local to the SQL Server you are running.

While you'll see that this as a rather simplistic sample, imagine an Excel file with 20 or greater columns and you'll begin to appreciate the ability to see column names and the ability to get right to the data without having to use some wizard to move "from/to" a destination.

So you might begin to think of things you could use this for.

Say someone lands a huge (>10GB) text file (SampleHugeFile.csv) on your network and you have to take a look inside to see what was delivered.  You're not going to use notepad or any other desktop client tool since most will eat your memory alive.  Instead just use the following and you'll see what's in the file without incurring the memory penalty:

SELECT  top 10 WorkRef, WorkKey
FROM OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\MSSQL\clients\INTLdist\Queries\LoadEandF\;Extended Properties="Text;HDR=YES"')...SampleHugeList#csv

I know some may see this as extreme but if you only have a hammer then everything starts to look like a nail.

Now for the final piece, since most of the world is using later versions of Excel you'll need slightly different OPENDATASOURCE provider notation:

select top 10 *
FROM OPENDATASOURCE('Microsoft.ACE.OLEDB.12.0',
'Data Source=C:\MSSQL\clients\INTLdist\Queries\LoadEandF\SampleEmailListInLaterVersion.xlsx;Extended Properties=Excel 12.0')...[Sheet1$]

 Note the change in providers when using the OPENDATASOURCE since you'll need to switch to this when using newer versions of Excel which support the xlsx file format.

Now start to mix it up and start using a CTE to manipulate the data like so:

;with CTE as 
(
SELECT distinct * --[First Name], [Last Name], [EmailAddress]
FROM OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\MSSQL\clients\INTLdist\Queries\LoadEandF\SampleEmailList.xls;Extended Properties=Excel 8.0')...[Sheet1$]
where [FirstName] is not null

)
	SELECT  
	   PARSENAME(REPLACE(EmailAddress,'@','.'),3) Email, 
       PARSENAME(REPLACE(EmailAddress,'@','.'),2) Domain, 
       PARSENAME(REPLACE(EmailAddress,'@','.'),1) DomainExt 
FROM CTE
 

results in:

Email Domain DomainExt
Anderson769 contoso com
Hayes231 contoso com
Russell637 contoso com
Simmons791 contoso com
Young91 contoso com
aarons contoso com
Gonzales286 contoso com

 

 SQL Server has many hidden gems that you need to see just once and then you'll understand how it can increase your productivity.



Tags: , ,

Dec 14 2014

IIS asynchronous processing when using Session

Category: Administrator @ 11:33

First some background.  This all came about since we have a new developer using all the latest Javascript front-end coding tools/practices and he is developing using Apache locally since he is not a Microsoft guy.  So until we could get him into Visual Studio he had gotten along using Apache to develop and even demo the application.

Low and behold, his development workstation running Apache is more than double the speed of IIS on a Windows 2008R2 server. He even has more network hops to get the data he requires than IIS since all the resources for IIS are located on the same subnet.

I couldn't figure out how and why Apache was working so much faster.

Here's the how the application works.  The Javascript front-end makes seven asynchronous calls to the web service to bring back all the data for the home page.  Now that we're moving to a stable IIS configured development platform I ported his code to C# and expected that the IIS calls would run the seven asynchronous calls just like Apache.  Well it doesn't

I had run the gamut on this – from IIS worker threads to all sorts of machine configuration and web configuration changes.  None of that works.

Finally the Network Guy to put a Wire Shark on and it revealed that Apache was opening up seven ports to complete the calls to the web service and IIS was only opening one port to communicate with the web service. Until the network guy revealed that this was happening I was completely stumped.

Basically IIS ended up blocking the results from returning since they are all on the same port.  IIS thinks that you’re intending to mess with the Session variable so IIS prevents a race condition.

All you have to do is put a page declarative in to make the Session State Read Only.  We do this on a subsequent (data getter) page after the initial Login page so only the Login uses/stores the session variable. 

All other pages just interrogate the variable and this permits IIS to open as many ports as necessary to satisfy the asynchronous calls.  You’ll probably never ever hit this but it’s good to know that IIS still can do its job.

Here's where I found the answer (thank goodness since managers were ready to jettison IIS):
 

http://johnculviner.com/asp-net-concurrent-ajax-requests-and-session-state-blocking/

 

 

Tags: , , ,

Jun 28 2014

Comparing String Values for Similarities

Category: C# | SQL ServerAdministrator @ 12:41

Many moons ago I embarked on a proof-of-concept project to see if I could use SQL Server to perform a matching process.

It was successful but I still had a lingering suspicion that the underlying algorithm (Levenshtein) used to determine the sameness between human full names was less than optimal.  So what ensued was a period where I would look around the web, once in a while, to see if I missed something. 

I can't tell you how many algorithms I researched, converted to C# and tested.  Don't ask me why, but anyone who starts with Levenshtein will most likely never find this algorithm since so many others like Levenshtein vie for attention.

I finally found what I was looking for purely by accident:

Strike-A-Match: http://www.catalysoft.com/articles/StrikeAMatch.html

and kudos to the paste bin for the C# version: http://pastebin.com/EfcmR3Xx#

So now if you ever need to compare human full names whether they be in any order like:

  • last name, first name compared to first name, last name

Strike-A-Match will do the trick. Take a look at this comparison of a human name: "Jimi Hendrix" to "Hendrix Jimi"

Using Strike-A-Match will compute that these two are exactly equivalent.

Enough said.

I embedded this into A SQLCLR function and it works like a charm.

Levenshtein and all your brethren really don't get the job done when all you really want to do is compare for similarity.

The web has the brightest ideas but try to look in all the dark corners.

Tags: , ,