Random Dev Notes

February 20, 2011

LINQ to VFP’s usage of XmlToCursor

Filed under: Development — Tags: — Tom Brothers @ 8:55 pm

About a year ago I wrote a blog entry explaining a change that was made to LINQ to VFP to help prevent the VFP error “1812 – Statement too long.”  To be honest, I was never really satisfied with the solution… but at the time it seemed to be the only way I could deal with this annoying VFP error.  Fortunately though, a much more knowledgeable VFP developer, Cetin Basoz, stumbled upon my blog entry and offered a more elegant solution to this problem.  He pointed me to a link, that pre-dated my blog entry by a couple years, that suggested using the VFP’s XmlToCursor function instead of creating temp tables.  Needless to say, I took his advice and modified LINQ to VFP accordingly.

In this blog entry, I will provide more details on a scenario that resulted in the VFP error, how it can be dealt with using ADO.Net, and the changes made to LINQ to VFP to avoid this error.

First off, let me say that what you are about to see here is a dramatization of actual events.  The database names and table names have been changed to protect the innocent.  🙂

As a VFP developer, I’ve picked up a project that has been in production for many years.  Prior to my involvement, a senior developer convinced the project owner that the company needed to move away from DBFs and start using SQL Server.  The convincing argument had something to do with preventing corrupted memo files and index files.  The product owner was tired of hearing from the support team that the customer had to lose some data due to these issues.  In an effort to move forward, it was decided that new tables could only be created in SQL Server.  Existing table would be move to SQL Server at a later time to be determined when it is more convenient.  Unfortunately, the project owners never found a convenient time for converting the core tables.  This ultimately resulted in a fractured data system which consequently produced less than optimal query performance. 

This chain of events provides a very general idea of what led to the VFP error.  Now lets dig into this idea a little more with a concrete example using the Northwind database.

Setting up the scenario…

Assume that at one time an Order could be for only one Product and all the information was stored in the Orders table.  This data model seemed to satisfy the business needs for awhile but as time went on various pain points were attributed to this design.  Consequently, it was decided that it was time to create an Orders Details table so that multiple Products could be part of a single Order.  The new Order Details table would be created in SQL Server while the Orders and Products tables remained DBFs.

… with a basic scenario explained… time to move onto a work item

The product owner decided that what the application could really use is a report that provides a list of Customers that have received discounts on any Products.  This would have been a trivial work item had the data been in one data system but now that it spread across two data systems this will require several steps.

The basic steps are:

  1. Query SQL data for a list of Order Details records that have a discount
  2. Get a distinct list of Order Ids from the discounted Order Details
  3. Query VFP data for the Customer Information

Here is a code example of the steps listed above:

SqlData sqlData = new SqlData();
VfpData vfpData = new VfpData();
 
// Step 1:  get the OrderDetails that have a discount
DataTable orderDetails = sqlData.CreateDataTable("SELECT * FROM [Order Details] WHERE discount > 0");
 
// Step 2:  get a distinct list of order ids
string[] orderIds = orderDetails.AsEnumerable()
                             .Select(row => row.Field<int>("OrderId").ToString())
                             .Distinct()
                             .ToArray();
 
// Step 3:  
//          build select statement that looks for all orders specified in the orderIds array
string sql = string.Format(@"
                select o.OrderId, c.CustomerId, c.CompanyName
                    from Orders o
                    inner join Customers c on upper(allt(o.CustomerId)) == upper(allt(c.CustomerId))
                    where o.OrderId in ({0})", string.Join(",", orderIds));
 
//          execute select statement... results in vfp error
DataTable orderCustomerInfo = vfpData.CreateDataTable(sql);

Here is the select statement VFP couldn’t handle:

select o.OrderId, c.CustomerId, c.CompanyName
    from Orders o
    inner join Customers c on upper(allt(o.CustomerId)) == upper(allt(c.CustomerId))
    where o.OrderId in (10250,10251,10252,10254,10258,10260,10262,10263,10264,10266,10267,10269,10273,10275,10279,10284,10285,10287,10288,10291,10298,10303,10305,10314,10324,10327,10329,10330,10332,10333,10335,10336,10339,10340,10341,10342,10343,10344,10346,10347,10348,10350,10351,10352,10353,10357,10358,10359,10361,10368,10369,10370,10371,10372,10373,10376,10377,10379,10380,10385,10388,10390,10393,10395,10397,10398,10403,10404,10406,10411,10412,10414,10417,10419,10420,10421,10424,10425,10429,10430,10431,10434,10436,10438,10440,10443,10446,10450,10451,10452,10453,10454,10456,10459,10460,10461,10464,10465,10469,10472,10475,10476,10477,10478,10483,10485,10487,10488,10489,10491,10492,10493,10496,10500,10506,10507,10510,10511,10512,10513,10515,10516,10519,10522,10523,10525,10526,10527,10528,10533,10534,10535,10536,10541,10542,10543,10547,10548,10549,10550,10551,10554,10555,10559,10560,10562,10564,10565,10566,10567,10569,10570,10571,10572,10580,10581,10583,10584,10586,10588,10590,10592,10593,10595,10596,10597,10602,10603,10604,10605,10606,10610,10613,10616,10617,10622,10623,10627,10630,10631,10632,10633,10635,10637,10640,10642,10643,10644,10646,10648,10650,10651,10652,10653,10654,10655,10656,10658,10659,10661,10663,10664,10667,10668,10672,10677,10680,10681,10686,10687,10688,10689,10690,10693,10697,10698,10700,10701,10707,10712,10714,10717,10719,10721,10727,10730,10731,10735,10740,10741,10743,10744,10750,10751,10755,10756,10760,10761,10764,10765,10769,10770,10773,10774,10776,10777,10781,10784,10786,10787,10788,10790,10791,10794,10795,10796,10799,10800,10801,10802,10803,10804,10806,10808,10812,10813,10814,10816,10817,10823,10832,10833,10834,10835,10837,10838,10839,10840,10843,10845,10847,10849,10850,10851,10854,10855,10857,10859,10863,10865,10866,10868,10871,10872,10875,10877,10878,10880,10882,10884,10891,10892,10894,10899,10900,10902,10905,10908,10912,10913,10918,10923,10924,10925,10930,10931,10932,10935,10936,10938,10939,10941,10944,10951,10952,10953,10954,10955,10959,10960,10961,10963,10966,10970,10978,10980,10983,10985,10988,10990,10991,10993,10994,10997,10999,11000,11002,11006,11008,11009,11011,11012,11014,11020,11021,11025,11027,11030,11033,11034,11038,11041,11046,11047,11049,11050,11051,11052,11053,11062,11063,11064,11065,11068,11070,11071,11074,11075,11076,11077)

Finally, after a long winded setup of the scenario I’ve gotten back to the decision point in my last post where I decided I needed to use temp tables to get passed this issue.

Here is the code example for the same 3 basic steps only this time using XmlToCursor:

SqlData sqlData = new SqlData();
VfpData vfpData = new VfpData();
 
// Step 1:  get the OrderDetails that have a discount
DataTable orderDetails = sqlData.CreateDataTable("SELECT * FROM [Order Details] WHERE discount > 0");
 
// Step 2:  get a distinct list of order ids
int[] orderIds = orderDetails.AsEnumerable()
                             .Select(row => row.Field<int>("OrderId"))
                             .Distinct()
                             .ToArray();
 
// Step 3:  
//          build xml
StringBuilder sb = new StringBuilder();
sb.Append("<VFPData>");
 
foreach (int orderId in orderIds) {
    sb.Append("<curTempIdList Id='");
    sb.Append(orderId);
    sb.Append("' />");
}
 
sb.Append("</VFPData>");
string xml = sb.ToString();
 
DataTable orderCustomerInfo;
 
using (OleDbCommand command = vfpData.CreateCommand()) {
    //      build select statement that looks for all orders specified in the orderIds array
    //      using XmlToCursor
    command.CommandText = @"
        select o.OrderId, c.CustomerId, c.CompanyName
            from Orders o
            inner join Customers c on upper(allt(o.CustomerId)) == upper(allt(c.CustomerId))
            where o.OrderId in (select Id from (iif(XmlToCursor(?, 'curTempIdList') > 0, 'curTempIdList', '')))";
 
    command.Parameters.Add("Xml", xml);
 
    //          execute select statement... executes properly without a VFP error
    orderCustomerInfo = vfpData.CreateDataTable(command);
}

Here is the selected statement:

select o.OrderId, c.CustomerId, c.CompanyName
    from Orders o
    inner join Customers c on upper(allt(o.CustomerId)) == upper(allt(c.CustomerId))
    where o.OrderId in (select Id from (iif(XmlToCursor(?, 'curTempIdList') > 0, 'curTempIdList', '')))

There is one main consideration that you would want to keep in mind when using XmlToCursor as I have in this example.  The generated xml didn’t include any schema information which resulted in VFP making an assumption as to the expected data type.  VFP’s assumption worked out for this example.  But if the OrderId column had been a string this query would have failed due to a data type mismatch.  To correct this issue, the xml would have to include a schema that explicitly identified the data type.

Here is an example querying the Customers table which has the CustomerId column define as a string.  In this example the XmlToCursor function creates a cursor with the Id column being an int instead of a string which results in an “Operator/operant type mismatch” exception:

VfpData vfpData = new VfpData();
 
// notice that this string array contains only numbers
string[] customerIds = { "1", "2" };
 
StringBuilder sb = new StringBuilder();
sb.Append("<VFPData>");
 
foreach (string customerId in customerIds) {
    sb.Append("<curTempIdList Id='");
    sb.Append(customerId);
    sb.Append("' />");
}
 
sb.Append("</VFPData>");
string xml = sb.ToString();
 
DataTable customers;
 
using (OleDbCommand command = vfpData.CreateCommand()) {
    command.CommandText = @"
        select *
            from Customers 
            where CustomerId in (select Id from (iif(XmlToCursor(?, 'curTempIdList') > 0, 'curTempIdList', '')))";
 
    command.Parameters.Add("Xml", xml);
 
    // causes "Operator/operand type mismatch." exception
    customers = vfpData.CreateDataTable(command);
}

To help deal with this issue, I wrote a class that would convert an array to XmlToCursor formatted xml which would include the schema when necessary.

class XmlToCursor {
    private readonly XNamespace xsd = "http://www.w3.org/2001/XMLSchema";
    private readonly XNamespace msdata = "urn:schemas-microsoft-com:xml-msdata";
    private const string ROOTNAME = "VFPData";
    public readonly string CursorName = "curTempIdList";
    public readonly string ColumnName = "Id";
    private Type arrayType;
    
    public string GetXml(Array array) {
        this.arrayType = this.GetArrayType(array);
 
        XDocument xdoc = new XDocument();
        XElement root = new XElement(ROOTNAME);
        xdoc.Add(root);
 
        XElement schema = this.GetSchemaElement();
 
        if (schema != null) {
            root.Add(schema);
        }
 
        foreach (var item in array) {
            if (item == null) {
                root.Add(new XElement(CursorName));
            }
            else {
                root.Add(new XElement(CursorName,
                            new XAttribute(ColumnName, item)));
            }
        }
 
        return xdoc.ToString();
    }
 
    private Type GetArrayType(Array array) {
        Type type = array.GetType();
 
        // Try to get the actual array type incase it requires schema information in the xml.
        if (type.FullName == "System.Object[]" && array.Length > 0) {
            type = Type.GetType(array.GetValue(0).GetType().FullName + "[]");
        }
 
        return type;
    }
 
    private XElement GetSchemaElement() {
        XElement attribute = this.GetAttributeElement();
 
        if (attribute == null) {
            return null;
        }
 
        return new XElement(xsd + "schema",
                new XAttribute("id", ROOTNAME),
                new XAttribute(XNamespace.Xmlns + "xsd", xsd.NamespaceName),
                new XAttribute(XNamespace.Xmlns + "msdata", msdata.NamespaceName),
                new XElement(xsd + "element",
                    new XAttribute("name", ROOTNAME),
                    new XAttribute(msdata + "IsDataSet", true),
                    new XElement(xsd + "complexType",
                        new XElement(xsd + "choice",
                            new XAttribute("maxOccurs", "unbounded"),
                            new XElement(xsd + "element",
                                new XAttribute("name", CursorName),
                                new XAttribute("minOccurs", 0),
                                new XAttribute("maxOccurs", "unbounded"),
                                new XElement(xsd + "complexType",
                                    attribute,
                                    new XElement(xsd + "anyAttribute",
                                        new XAttribute("namespace", "http://www.w3.org/XML/1998/namespace"),
                                        new XAttribute("processContext", "lax"))))))));
    }
 
    private XElement GetAttributeElement() {
        XElement complexType = null;
 
        switch (this.arrayType.FullName) {
            case "System.String[]":
                complexType = this.GetVarCharAttributeElement();
                break;
            default:
                return null;
        }
 
        return complexType;
    }
 
    private XElement GetVarCharAttributeElement() {
        const int MAX_VARCHAR_LENGTH = 254;
 
        return new XElement(xsd + "attribute",
                new XAttribute("name", this.ColumnName),
                new XAttribute("use", "required"),
                new XElement(xsd + "simpleType",
                    new XElement(xsd + "restriction",
                    new XAttribute("base", "xsd:string"),
                    new XElement(xsd + "maxLength",
                        new XAttribute("value", MAX_VARCHAR_LENGTH)))));
    }
}

Here is an example of the Customer query rewritten to use the XmlToCursor class:

VfpData vfpData = new VfpData();
 
// notice that this string array contains only numbers
string[] customerIds = { "1", "2" };
XmlToCursor xmlToCursor = new XmlToCursor();
string xml = xmlToCursor.GetXml(customerIds);
 
DataTable customers;
 
using (OleDbCommand command = vfpData.CreateCommand()) {
    command.CommandText = @"
        select *
            from Customers 
            where CustomerId in (select Id from (iif(XmlToCursor(?, 'curTempIdList') > 0, 'curTempIdList', '')))";
 
    command.Parameters.Add("Xml", xml);
 
    // the query succeeds
    customers = vfpData.CreateDataTable(command);
}

At this point, I’ve explained a basic scenario that has required the XmlToCursor function.  I’ve also shown how the function can be used in ADO.net.  So how does this XmlToCursor stuff affect LINQ to VFP…?

Before becoming aware of the potential usage of XmlToCursor I had included some temp table helper classes in LINQ to VFP.  These classes got the job done but resulted in some very messy code because it required branching the code to handle queries that contained too many array items (see code below).  In addition to less than desirable code, the LINQ to VFP developer had to evaluate each query to determine if it had potential of causing the “statement too long” error. 

NorthwindDataContext context = new NorthwindDataContext(connectionString);
 
int[] ids = context.List<Order>().OrderBy(o => o.OrderID).Take(160).Select(o => o.OrderID).ToArray();
List<Order> list;
 
if (ids.Length > 100) {
    using (TempTable<TempTableIntId> tempTable = context.Provider.CreateTempTable(ids)) {
        list = (from o in context.List<Order>()
                join t in tempTable.EntityTable on o.OrderID equals t.Id
                select o).ToList();
    }
}
else {
    list = (from o in context.List<Order>()
            where ids.Contains(o.OrderID)
            select o).ToList();
}
 
return list;

What I think would have been much better is if LINQ to VFP evaluated the query to determine the array length so that it could modify the select statement as needed.  In retrospect, I probably could have made the select statement rewrite happen with the temp tables but it just didn’t seem like the right thing to do at the time.  Nevertheless, this essentially describes the process that was included when LINQ to VFP was changed to use XmlToCursor.  In addition to this new process I have added a new setting that is used to opt-in to the whole XmlToCursor rewrite process.

Here is the previous code example rewritten so that the sql select statement includes the XmlToCursor function.

// can be set once in the application startup 
VfpQueryProvider.XmlToCursorMinimumItemCount = 100;
 
NorthwindDataContext context = new NorthwindDataContext(connectionString);
 
int[] ids = context.List<Order>().OrderBy(o => o.OrderID).Take(160).Select(o => o.OrderID).ToArray();
 
List<Order> list = (from o in context.List<Order>()
            where ids.Contains(o.OrderID)
            select o).ToList();
 
return list;

Blog at WordPress.com.