Jun 28 2014

Comparing String Values for Similarities

Category: C# | SQL Server
Administrator @ 12:41

Many moons ago I embarked on a proof-of-concept project to see if I could use SQL Server to perform a matching process.

It was successful, but I still had a lingering suspicion that the underlying algorithm (Levenshtein) used to determine the sameness between human full names was less than optimal.  So what ensued was a period where I would look around the web, once in a while, to see if I had missed something.

I can't tell you how many algorithms I researched, converted to C# and tested.  Don't ask me why, but anyone who starts with Levenshtein will most likely never find this algorithm since so many others like Levenshtein vie for attention.

I finally found what I was looking for purely by accident:

Strike-A-Match: http://www.catalysoft.com/articles/StrikeAMatch.html

and kudos to whoever posted the C# version on Pastebin: http://pastebin.com/EfcmR3Xx#

So now if you ever need to compare human full names that may arrive in either order, such as:

  • last name, first name compared to first name, last name

Strike-A-Match will do the trick. Take a look at this comparison of a human name: "Jimi Hendrix" to "Hendrix Jimi"

Strike-A-Match scores these two as exactly equivalent, because it compares the adjacent letter pairs inside each word and ignores the order of the words.
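To make the "Jimi Hendrix" comparison concrete, here is a minimal C# sketch of the letter-pair idea described in the Strike-A-Match article. It is my own condensed version, not the Pastebin code, and the class and method names (StrikeAMatchSketch, Similarity) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class StrikeAMatchSketch
{
    // Collect the adjacent letter pairs of every whitespace-separated word,
    // upper-cased so the comparison ignores case.
    private static List<string> WordLetterPairs(string text)
    {
        var pairs = new List<string>();
        string[] words = text.ToUpperInvariant()
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            for (int i = 0; i < word.Length - 1; i++)
                pairs.Add(word.Substring(i, 2));
        }
        return pairs;
    }

    // Similarity = 2 * shared pairs / total pairs, a value between 0 and 1.
    public static double Similarity(string a, string b)
    {
        List<string> pairsA = WordLetterPairs(a);
        List<string> pairsB = WordLetterPairs(b);
        int total = pairsA.Count + pairsB.Count;
        if (total == 0) return 0.0;

        int shared = 0;
        foreach (string pair in pairsA)
        {
            // Remove each matched pair so duplicates are only counted once.
            if (pairsB.Remove(pair)) shared++;
        }
        return 2.0 * shared / total;
    }
}
```

Because the pairs are collected word by word, StrikeAMatchSketch.Similarity("Jimi Hendrix", "Hendrix Jimi") sees the same nine pairs on both sides and returns 1.0.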

Enough said.

I embedded this into a SQLCLR function and it works like a charm.

Levenshtein and all your brethren really don't get the job done when all you really want to do is compare for similarity.

The web has the brightest ideas but try to look in all the dark corners.


Feb 19 2013

SQL Server Async Stored Procedures

Category: SQL Server | Stored Procedure
Administrator @ 08:05

As the saying goes, better late than never, and that goes double for this technique for executing stored procedures.

What I'm referring to is a post from Remus Rusanu, who explains in detail how to run stored procedures in an asynchronous fashion.


In the set of two articles, he explains how to take a stored procedure, even with parameters, and run it without holding a client connection.

This is extremely useful if you need to let users run long-running procedures.

You may be asking yourself why not just create a SQL Agent job and start that from a client, which has the same effect since SQL Agent jobs run asynchronously.  Well, that's true, but this technique is more flexible since you can pass parameters from the client, whereas SQL Agent has no facility for passing parameters directly to a job.

He includes a detailed script for installing his helper stored procedures and the procedure for handling the Service Broker.  Yes I said Service Broker.

Don't get overly concerned: Service Broker is quite easy to set up and to operate.

I won't go into his solution since you've probably already read it; what I offer here are just some small modifications which make it a little bit nicer to use.

To get started, all you need is for the DBA to enable Service Broker on the database.



Once this is done you're good to install all the Stored Procedures from Remus.

To check whether the broker is enabled, ask the DBA or run:

SELECT is_broker_enabled FROM sys.databases WHERE name = 'xxxxxx';


One of the modifications I made to the set of procedures was to be able to see the Start_Time of the running Stored Procedure.

In Remus's routines, all execution is within a transaction, which proves invaluable since you'll need the rollback capability in the case of failure.

But I don't think the Start_Time needs to be rolled back with the transaction, and I want to display the Start_Time to the users who requested the run of the procedure.

So all I did was include an execution of a CLR procedure to update the Start_Time.  Since the CLR procedure executes outside the transaction, it updates the Start_Time immediately.

I made the change here in the AsyncExecActivated procedure:


begin try;
    receive top (1).......

if (@messageTypeName = N'DEFAULT'....

--CLR procedure


I also included a way to cancel a waiting procedure which is next in line to be handled by Service Broker execution.

It's crude, but it allows you to stack up requests and act on behalf of the user to cancel inadvertent requests.

Just update the error_message with anything (before it runs) for that task and it will effectively be cancelled.

select @error_message = error_message from dbo.tblTaskResults
      where [token] = @token;

--see if the request was canceled
if (@messageTypeName = N'DEFAULT' and @error_message is null)


I've put together an ASP.NET Page which allows users to initiate and cancel queued requests:


I've included my script for those who would like to build onto my recipe for Asynchronous Procedure Execution:


I drive my whole web page and the Tasks available for execution based solely upon a SQL configuration table.

This technique frees the developer to add new tasks at any time, and you no longer need to add more SQL Agent jobs.

Developers need time to dive into subject matter.  Heads down developing is a trap.

Sep 30 2012

SSIS Package Execution with C# -- SQL Server

Category: C# | SQL Server
Administrator @ 08:16

There are times when it seems like IT management decisions are arbitrary and capricious.  This is one of them!

As with most shops, we have SQL Agent running the SSIS production packages but when we migrate to newer servers we are now instructed to no longer use the SQL Agent for execution on the SQL Server. 

OK -- so what's the substitute? 

I'm told to use batch (.bat) files which are executed through some other agent tool, or to use xp_cmdshell.

I feel like I'm going backwards in time and not moving forward with how an operational environment should be architected.  But thankfully some clever individuals have already paved the path to a more beautiful world.

And that world is C# execution of the packages.  You may be asking what the heck I'm talking about, but SSIS packages can run anywhere.  I know this sounds unconventional, but you can execute packages as long as you have the dependent dlls and the appropriate .NET framework in that environment.  Read through the following blog entry and you'll quickly see how to get SSIS packages running from any Windows server and not just a Windows server running SQL Server:

Running Packages from C#

The beauty of doing it this way is that you now can have any Windows server running a service which executes the packages and base that execution upon a database configuration. 


  • Configuration files:
    • Running the SSIS package through C# allows you to pass in "User Variables" to the packages.  So just read what you want from a SQL configuration table and pass it to the package.
    • That's right, you no longer have to maintain config files in folders; that maintenance moves to a configuration table in SQL Server.
            pkgLocation = Path.Combine(pkgLocation, pkgName.Replace("\"", ""));
            DtsLogging mylogger = new DtsLogging();
            Application app = new Application();

            //Package pkg = app.LoadPackage(pkgLocation, eventListener);
            Package pkg = app.LoadPackage(pkgLocation,null);

            pkg.Variables["User::DmatchDataSource"].Value = pkgDmatchDataSource;
            pkg.Variables["User::DmatchUserId"].Value = pkgDmatchUserId;
            pkg.Variables["User::DmatchPassword"].Value = pkgDmatchPassword;
  • Error handling:
    • Take a consistent approach to error logging across your applications.  An errors collection is exposed from the C# package execution so that you can keep all your application logging in one place.
    • No more looking at an application log for one thing and SQL Agent history for another event.

In my case I wanted to capture the rows being sent across the wire (OnPipelineRowsSent) to SQL Server, and that can be captured once logging is enabled:

            pkg.LoggingOptions.EventFilterKind = DTSEventFilterKind.Inclusion;
            pkg.LoggingOptions.EventFilter = new string[] { "OnPipelineRowsSent" };

            DTSEventColumnFilter ecf = new DTSEventColumnFilter();
            ecf.MessageText = true;
            pkg.LoggingOptions.SetColumnFilter("OnPipelineRowsSent", ecf);
            pkg.LoggingMode = DTSLoggingMode.Enabled;
            DTSExecResult pkgResults = pkg.Execute(null,null,null,mylogger,null);

Here is the DtsLogging class:

internal class DtsLogging : IDTSLogging
{
      public bool Enabled
      { get { return true; } }

      ulong rowsprocessed = 0;
      Stopwatch stpWatch = new Stopwatch();
      string pkgName = "";

      public void Initialize(string Pkgname)
      {
          pkgName = Pkgname;
      }

      public void Log(string eventName, string computerName, string operatorName, string sourceName, string sourceGuid, string executionGuid, string messageText, DateTime startTime, DateTime endTime, int dataCode, ref byte[] dataBytes)
      {
          switch (eventName)
          {
              case "OnPipelineRowsSent":
                      // Nothing to parse when the event carries no message.
                      if (messageText == null)
                          break;
                      if (messageText.StartsWith("Rows were provided to a data flow component as input."))
                      {
                          // The row count is the last space-delimited token of the message.
                          string rowsText = messageText.Substring(messageText.LastIndexOf(' '));
                          ulong rowsSent = ulong.Parse(rowsText);

                          // Only count the rows leaving the OLE DB source.
                          if (messageText.Contains(" OLE DB Source Output "))
                              LogRowProcessedInfo(rowsSent);
                      }
                      break;
          }
      }

      public bool[] GetFilterStatus(ref string[] eventNames)
      {
          //bool[] boolret = {};
          return new bool[] { };
      }

      void LogRowProcessedInfo(ulong rowsSent)
      {
          rowsprocessed += rowsSent;
          // Include further implementation for logging to db and text file.
          if (stpWatch.Elapsed.Minutes >= Convert.ToInt32( Config.Instance().EventMessageTimeInterval))
              eventLogSimple.WriteEntry("Pkg: " + pkgName + ", PipelineRowsSent: " + rowsprocessed.ToString());
      }
}


In my case I made a Windows service to run the packages.  This way the C# code looks at a database to schedule when to execute a particular SSIS package.  The dtsx files are kept locally on the application server and the C# code loads them and runs them locally.  This ends up using the resources of the application server, and there's no resource impact felt on the SQL Server (my DBA loves this fact).


Sometimes from miserable circumstances comes inspiration.


May 20 2012

SQL Server 2012 -- Full-Text Search (Matching Engine)

Administrator @ 06:48

Having been immersed in a Full-Text Search (FTS) proof-of-concept over the past two months, I thought others would benefit from this experience.  First, let me start by saying that the way this came about was rather unusual.  A co-worker of mine had started to play with FTS and turned around one afternoon and asked me for some T-SQL help.  When I saw the language constructs of FTS that he was using I had what I call a h$!y s^!t moment.  The last time that I looked at using anything close was with SoundEx, which was a mess for my purposes.  I quickly saw how I could make great use of this technology.  And I'm only scratching the surface in what I'm doing, but check out the video link below for more on what FTS is supposed to be used for (documents):



In my day job we do lots and lots of matching.  By that I mean we get in files with a person's name and the name of a piece of copyrighted material.  Internally we have a database of persons' names and their related pieces of copyrighted material.  What we need to do is take the incoming data and find the match in our database.  Sounds easy, right?  Well, not with our current technology.

My thought was to take FTS and use it on the database of data we currently have and then take the incoming data and try to find the closest match.

If you deploy this on SQL 2008R2 you won't benefit from the changes in 2012, which increased performance dramatically.  And performance is the key to this whole matching process.

In my matching tables, I have over 44 million rows to search, bring back results and score the results.  I was able to attain just over four attempted matches per second.  Not bad considering that I'm running the Developer edition on my local workstation with just two crappy SATA drives.  And don't worry about space consumption: for my test case of 44 million rows, the FT index used just under one gig once its population completed.

Let's start:

First you'll need to install FTS which is no big deal and I'll leave that to you.


Now the important steps:

  • Create a set of FileGroups (one for each table you'll be searching/indexing).  The best case scenario is to have many disks and to keep the clustered index of each searched table on a different disk from its FT index.

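ALTER DATABASE Dmatch ADD FILEGROUP fgDmatchX;
ALTER DATABASE Dmatch ADD FILEGROUP fgDmatchY;

ALTER DATABASE Dmatch ADD FILE
( NAME = DmatchdatX,
  FILENAME = 'X:\MSSQL\data\DmatchdatX.ndf',
  SIZE = 20000MB,
  MAXSIZE = 28000MB )
TO FILEGROUP fgDmatchX;

ALTER DATABASE Dmatch ADD FILE
( NAME = DmatchdatY,
  FILENAME = 'Y:\MSSQL\data\DmatchdatY.ndf',
  SIZE = 20000MB,
  MAXSIZE = 28000MB )
TO FILEGROUP fgDmatchY;

--ALTER DATABASE Dmatch REMOVE FILE DmatchdatX

--ALTER DATABASE Dmatch REMOVE FILE DmatchdatY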


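  • Create an FTS Catalog for each full-text index:
USE Dmatch;
CREATE FULLTEXT CATALOG DmatchWrkPtyWrtCatalog;

USE Dmatch;
CREATE FULLTEXT CATALOG DmatchWrkPtyPubCatalog;


  • Create a StopList -- for this example I'm not going into everything I added to and removed from the StopList, but I'll include the SQL to show how: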


--drop fulltext stoplist Dmatch1StopList;
Create FullText StopList Dmatch1StopList from System Stoplist;

  • Modify the StopList to suit your needs:
alter fulltext stoplist Dmatch1StopList  drop 'about' language 1033;
alter fulltext stoplist Dmatch1StopList  add 'company' language 1033;
  • Create a Full-Text Index


--DROP FULLTEXT INDEX ON tblWrkPtyWriterSearch

--Create FTS index on work and Pty name

CREATE FULLTEXT INDEX ON tblWrkPtyWriterSearch(WrkNa,PtyNa) 
   KEY INDEX ix1WrkPtySearch on ([DmatchWrkPtyWrtCatalog], FILEGROUP [fgDmatchY])
   WITH STOPLIST = Dmatch1StopList;

--DROP FULLTEXT INDEX ON tblWrkPtyPublisherSearch

--Create FTS index on work and Pty name

CREATE FULLTEXT INDEX ON tblWrkPtyPublisherSearch(WrkNa,PtyNa) 
   KEY INDEX ix1WrkPtySearch on ([DmatchWrkPtyPubCatalog], FILEGROUP [fgDmatchX])
   WITH STOPLIST = Dmatch1StopList;


  • Check the status of the Population of the FT index:
-- Number of full-text indexed items currently in the full-text catalog 
-- plus the population status and size:

SELECT 'DmatchWrkPtyWrtCatalog'
,FULLTEXTCATALOGPROPERTY('DmatchWrkPtyWrtCatalog', 'PopulateStatus') AS [Populate Status]
,FULLTEXTCATALOGPROPERTY('DmatchWrkPtyWrtCatalog', 'ItemCount')AS [Item Count]
, FULLTEXTCATALOGPROPERTY('DmatchWrkPtyWrtCatalog', 'IndexSize')AS [Size in MB];

SELECT 'DmatchWrkPtyPubCatalog'
,FULLTEXTCATALOGPROPERTY('DmatchWrkPtyPubCatalog', 'PopulateStatus') AS [Populate Status]
,FULLTEXTCATALOGPROPERTY('DmatchWrkPtyPubCatalog', 'ItemCount')AS [Item Count]
, FULLTEXTCATALOGPROPERTY('DmatchWrkPtyPubCatalog', 'IndexSize')AS [Size in MB];


  • Create procs and functions to return possible matches:

What I did here was to use the T-SQL CONTAINSTABLE function in conjunction with other functions to return a result of the top (x) matches.  I did the matching/scoring with the help of the Levenshtein algorithm.
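The stored procedures themselves are left for the next entry, but the scoring step can be illustrated on its own. Below is a minimal C# sketch of the Levenshtein edit distance, plus one possible way to turn it into a 0-to-1 score for ranking candidate matches. The normalization by the longer string's length and the names (LevenshteinSketch, Distance, Score) are my assumptions for illustration, not the functions used in the proof-of-concept.

```csharp
using System;

public static class LevenshteinSketch
{
    // Classic dynamic-programming edit distance, keeping only two rows.
    public static int Distance(string s, string t)
    {
        int[] prev = new int[t.Length + 1];
        int[] curr = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(curr[j - 1] + 1,   // insertion
                             prev[j] + 1),      // deletion
                    prev[j - 1] + cost);        // substitution
            }
            int[] swap = prev; prev = curr; curr = swap;
        }
        return prev[t.Length];
    }

    // Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
    public static double Score(string s, string t)
    {
        int longest = Math.Max(s.Length, t.Length);
        if (longest == 0) return 1.0;
        return 1.0 - (double)Distance(s, t) / longest;
    }
}
```

Each candidate row returned by the full-text query could be scored this way against the incoming value and the highest score kept, subject to whatever cut-off suits your data.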

My overall match rate was just over 73% and of excellent quality, meaning that I didn't match to something that was not supposed to match.

I'll continue the next blog entry with the actual stored procedures and functions that really do the work.

Full-Text Search on SQL 2012 is a gift.









Jan 16 2012

SwimEventTimes W7P Development -- Live Tile

Administrator @ 12:18

If you have an app then you owe it to yourself and users of your app to get something relevant onto the "pinned" Application Tile.  A user who has "pinned" your application will quickly expect something to appear on the back portion of the application tile.  This is what people consider the "cool" factor of owning a Windows Phone.  Having my phone open in the elevator with lots of flipping tiles makes the Windows Phone 7 look impressive to those other phone owners.  If your application tile does not flip and reveal any tidbits of information then your app is doomed to live unpinned and probably unused.

What I decided to do for my app was to have the user of the app select which swimmer(s) stat would appear on the Live Tile.  Then when the agent code (usually every 30 minutes) runs it will select a ("Best") random stat from the selected swimmer(s).  This way every 30 minutes I get a new stat for a swimmer on the back of the Live Tile.  You only have about 45 characters to play with so be selective about what you will show on the back of the Live Tile.


Now for the code:

From your main page include the start of the agent in the constructor:

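// Constructor
        public RequestedSwimmersPage()
        {
            InitializeComponent();
            // Register the background agent that refreshes the Live Tile.
            AgentMgr.StartAgent();
        }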

I've added a separate class to handle the use of the agent.  Keeps the code cleaner for future changes:

public static class AgentMgr
    {
        private const String AgentName = "SwimmerAgent";
        private const String AgentDescription = "Custom background agent for Swim Event Times pinned Tile!";

        public static void StartAgent()
        {
            // Remove any earlier registration so the task can be added again.
            StopAgentIfStarted();

            PeriodicTask task = new PeriodicTask(AgentName);
            task.Description = AgentDescription;
            ScheduledActionService.Add(task);
#if DEBUG 
        // If we're debugging, attempt to start the task immediately
            ScheduledActionService.LaunchForTest(AgentName, new TimeSpan(0, 0, 1)); 
#endif
        }

        public  static void StopAgentIfStarted()
        {
            if (ScheduledActionService.Find(AgentName) != null)
            {
                ScheduledActionService.Remove(AgentName);
            }
        }
    }

So to get things moving you'll need a separate project (for the Agent worker) added to your solution to have the agent run.  Select the "Scheduled Task Agent" project type when adding the project to your solution.  Once you have a separate (Agent) project then add a reference to that from your main application.

It runs on a 30 minute interval so the data that I'm pulling will change on the back of the pinned Application Tile on that interval.

Now the Agent project will have one class and one method (OnInvoke) that you need to be concerned with.  Place your code to get new information onto the back tile in this method:

        /// <summary>
        /// Agent that runs a scheduled task
        /// </summary>
        /// <param name="task">
        /// The invoked task
        /// </param>
        /// <remarks>
        /// This method is called when a periodic or resource intensive task is invoked
        /// </remarks>
        protected override void OnInvoke(ScheduledTask task)
        {
            //Here is where you put the meat of the work to be done by the agent

            //get the data from ISO 
            ObservableCollection<Swimmer> Swimmers = BackgroundAgentRESTCall.GetSwimmers();
            ObservableCollection<RequestedSwimmer> RequestedSwimmers = BackgroundAgentRESTCall.GetRequestedSwimmers();

            //pull out those swimmers who have the Live Tile turned on
            //and only those that match the 'Course' selection
            IEnumerable<Swimmer> tileSwimmers = from swimmer in Swimmers
                                            join reqswimmer in RequestedSwimmers on swimmer.SwimmerID equals reqswimmer.SwimmerID
                                            where reqswimmer.UseOnLiveTile == true
                                            && reqswimmer.BestCourseSelection == swimmer.Course
                                            select swimmer;

            string tileText = String.Empty;
            string titleText = string.Empty;
            //make sure that we have at least one stroke for a swimmer.
            if (tileSwimmers.Count() != 0)
            {
                //pull out the best swims for all marked (Live Tile) swimmers
                var bestswims = from p in tileSwimmers
                                //where conditions or joins with other tables to be included here                           
                                group p by p.SwimmerID + p.Stroke + p.Course + p.Distance into grp
                                let MinTime = grp.Min(g => g.TimeSecs)
                                from p in grp
                                where p.TimeSecs == MinTime
                                orderby p.Course descending, p.StrokeOrder, p.Distance
                                select p;

                //pick one of the best swims at random
                int count = bestswims.Count();
                int index = new Random().Next(count);

                var tileSwimmer = bestswims.Skip(index).FirstOrDefault();

                titleText = tileSwimmer.Distance + tileSwimmer.Stroke + ": " + tileSwimmer.Time;

                Guid NavGUID;
                NavGUID = (Guid)tileSwimmer.SwimmerID;

                var tileReqSwimmer = (from reqswimmer in RequestedSwimmers
                                      where reqswimmer.SwimmerID == NavGUID
                                      select reqswimmer).FirstOrDefault();

                //trim the last name and the standard so the text fits on the tile
                int LastnamLen = tileReqSwimmer.LastName.Length;
                if (LastnamLen > 9)
                    LastnamLen = 9;

                int StandardLen = tileSwimmer.Standard.Length;
                if (StandardLen > 6)
                    StandardLen = 6;

                tileText = tileReqSwimmer.FirstName.Substring(0, 1) + tileReqSwimmer.LastName.Substring(0, LastnamLen)
                + " Age: " + tileSwimmer.Age + "   "
                + String.Format("{0:MMM yyyy}", tileSwimmer.MeetDate) + "  "
                + tileSwimmer.Course + "  " + tileSwimmer.Standard.Substring(0, StandardLen);
            }
            else
            {
                //no swimmer has been selected for the Live Tile yet
                tileText = "Select a Swimmer to be shown.";
                titleText = "Event & Time";
            }

            UpdateAppTile(tileText, titleText);

            //tell the OS that the agent has finished this run
            NotifyComplete();
        }


The above routine gets the data from Isolated Storage. 

Note: I used a mutex in the call for the ISO data since we could be writing to the same storage at the time we are trying to read from it.

  I made heavy use of the sample mutex code listed from the link below:
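For readers who just want the shape of the idea, here is a minimal C# sketch of guarding a read with a named mutex so that the foreground app and the background agent never touch the same file at the same moment. It is a generic illustration built on System.Threading.Mutex, not the linked sample; the helper name (GuardedStorageSketch.WithMutex) and the mutex name in the usage note are hypothetical.

```csharp
using System;
using System.Threading;

public static class GuardedStorageSketch
{
    // Runs the supplied work while holding a named mutex, so any other
    // caller that uses the same name waits until the work has finished.
    public static T WithMutex<T>(string mutexName, Func<T> work)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            mutex.WaitOne();
            try
            {
                return work();
            }
            finally
            {
                // Always release, even when the work throws.
                mutex.ReleaseMutex();
            }
        }
    }
}
```

A read would then be wrapped along the lines of GuardedStorageSketch.WithMutex("SwimEventTimesStorage", () => ReadSwimmers()), where ReadSwimmers stands for whatever method opens the Isolated Storage file, and the writing side would wrap its save with the same mutex name.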

Then I used LINQ to pull out the data needed, format it and call the Update method for the Tile:

private void UpdateAppTile(string message, string backTitleText)
        {
            ShellTile appTile = ShellTile.ActiveTiles.First();
            if (appTile != null)
            {
                StandardTileData tileData = new StandardTileData
                {
                    BackContent = message
                    ,BackBackgroundImage = new Uri("/Images/LiveTileC173x173.png", UriKind.Relative)
                    ,BackTitle = backTitleText
                };

                //push the new back-of-tile content to the pinned tile
                appTile.Update(tileData);
            }
        }


That gives us a pinned Application Tile whose swimmer info changes on every agent run.

Swimmers like it since it randomly rotates through their "Best" swim times and it gives the app that cool flipping tile that other phone owners love to hate.




