HyperList

by Todd Geist

Overview

(Update Version 2.0.1 1/29/2014 – fixed the bug reported by Tim Anderson – more info here)

(Update Version 2.0 1/8/2014 – added support for up to 5 fields, detects and uses ListOf)
Post and video over at geistinteractive.com

(Update 10/7/2013 – added trailing return to maintain backwards compatibility )

(Update 8/20/2013 – new and improved faster list function. 2 to 3 times faster than the previous version. )

(update 7/2/2013 fixed a bug that caused HyperList to drop records when hitting found counts that ended in “01”.  101, 301, 100201, etc.)

HyperList is a very fast, completely abstracted module for capturing a found set of primary keys to a variable, for use in FileMaker Virtual List techniques. It has been shown to be faster than any other known method for gathering IDs in the current found set. Yes, it is faster than the CustomList custom function. It is capable of capturing 80,000 primary keys in about 10 seconds. As of version 2.0 it can gather up to 5 fields per record.

  • Works on any found set
  • Uses normal FileMaker sort order
  • Can handle tens of thousands of records
  • Works in FileMaker Go and FileMaker Pro

Credits

Sam and Jesse Barnum of 360Works.com figured out that string variables in FileMaker are probably immutable, which means they are never changed, only copied. They also discovered that if you are building long strings with a looping FileMaker script, it can be faster to write each loop's string to its own $variable and then later compact those variables back into one variable. HyperList builds on this insight.
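The insight above can be illustrated outside FileMaker. Here is a hedged Python analogy (the function names are mine, not HyperList's): Python strings are immutable too, so naive accumulation recopies the whole string on every pass, while writing each batch to its own slot and joining once mirrors HyperList's per-loop $variables and the final compaction step. `"\r"` stands in for FileMaker's ¶.

```python
# Illustrative sketch only -- an analogy for HyperList's batching idea,
# not FileMaker code. Strings are immutable in Python as well, so
# `result += chunk` copies the whole accumulated string each time.

def build_naive(ids):
    result = ""
    for i in ids:
        result += str(i) + "\r"   # full copy of `result` on every pass
    return result

def build_batched(ids, batch_size=100):
    batches = []                  # stands in for the per-loop $variables
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        batches.append("".join(str(i) + "\r" for i in chunk))
    return "".join(batches)       # the final "compact" step

assert build_naive(range(5)) == build_batched(range(5))
```

Both functions produce the same list; the batched form simply avoids rebuilding the ever-growing result on every record.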

Jason Young of seedcode.com built the multiple field support for version 2.0

Dependencies

None

License

Completely Free!

Support

email me for now

Download

Here it is…

HyperList2.0.fmp12

48 responses to “HyperList”

  1. Hi,
    nice!
    …but still not as fast as Copy All Records.

    • Todd Geist says:

      True, but “copy all records” messes with the clipboard, and requires at least two more layouts, one to place the key field on and another to receive the paste. HyperList doesn’t need any of that. 🙂

  2. Absolutely. I modified your file a bit, can I post it here?
    I added a faster technique with some dependencies though:
    – a hard-coded script step per field 🙁
    – a global field
    as well as a couple of other techniques which happen to be much slower for large found sets.

    • Todd Geist says:

      Hi Fabrice,

      I just got your email with the modified file. I will take a look at it and see if there is anything I want to pull into HyperList. I am not sure if I will want to or not. HyperList is both fast and easy to install. I know there are faster methods to gather IDs, but none that I know of that are so simple to install. And easy installation and maintenance is very important. Plus HyperList is already fast enough for most use cases.

      If you would like to create your own module with your methods to share on this site you are more than welcome. Just register on this site and I will upgrade your account so you can post your module. Please take a minute to read our documentation on how to build modules first (http://www.modularfilemaker.org/documentation/) if you haven’t already.

      Thanks

      Todd

  3. oh, by the way, you forgot to add the “Tell this bloody mac fans they can slow down now” 😉

  4. […] get the RecordID parsing optimized (or at least faster) using some of the techniques in Todd’s Hyperlist demo.  Although this is not the slow part of the process–feeding the large set of values into the […]

  5. Donovan says:

    I love the idea of providing this as a testable module that anyone can incorporate. It’s long overdue!

    Ralph Learmont, I believe, came up with a technique that tested twice as fast for me. I picked it up out of a demo several years ago that Matt Navarre provided. I haven’t dug very deep to determine the crucial differences. They both rely on chunking the data set, although Ralph’s doesn’t use Evaluate. Here’s a link to a copy of your file with my version of the script added. Note, it’s not exactly modified for use as a module. https://dl.dropboxusercontent.com/u/27269482/HyperList_Faster.zip

    • Todd Geist says:

      Oh yeah, there are some good things in there. It uses the same two basic techniques of batching records in groups of 100 and writing to separate variables for each loop. But it doesn’t require Evaluate() when doing the batching, and it uses a different method for compacting the separate variables.

      I am not sure which of those two things accounts for the speed increase. Maybe both. But I know from other experiments that Evaluate() is about twice as slow as without, so if I had to guess I would say it was that step.

      I am going to see about working this back into HyperList core and see if I can improve it some more.

      Thanks for posting this back, Donovan

      Todd

      • Donovan says:

        It would be great to see another rev of HyperList to include the faster techniques. One catch is that the field can’t be related. It might be helpful to have an alternative script that supports related fields, just in case. I’ve been using a wrapper that tests the field and then determines which method to apply.

        PS – Sorry for the late response. I didn’t realize I wouldn’t be getting an email notification. Then I got busy with other things.

        • Todd Geist says:

          I haven’t gotten around to trying to speed it up some more. I will at some point before DevCon.

          I haven’t ever needed to use this on a related field. What’s the use case for that?

          I do think it would be good to build a version of HyperList that could take a list of fields. It seems do-able.

  6. Simon Plint says:

    Todd,

    I have been doing a lot of work developing a template solution with HyperList, Master Detail, Navigation and Restore Tabs. Thanks for this great resource.

    I was recently working with a set of 4301 records and found that only 4200 rows were produced in the sidebar of MasterDetail. I stepped through “Get Values in Found Set ( FieldName )” in the HyperList module and found that the main loop ended on the 4200th record. The next script step sets the variable $expression to “GetN” but it should be setting it to an expression that will capture the rest of the records.

    Let(
      [
        s = $expression ;
        n = Get ( FoundCount ) - Get ( RecordNumber ) + 1 ;
        sep = "$Sep" ;
        p = Position ( s ; sep ; 1 ; n ) + Length ( sep )
      ] ;
      Left ( s ; p )
    )

    n is being set to 101 (4301-4200+1)
    such that Position(s; sep; 1; 101) returns 0. I guess because there isn’t a 101st occurrence of sep.
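Simon's diagnosis can be mimicked in Python. The following is a hypothetical recreation of Position() and Left() (not FileMaker's actual implementation), showing how asking for a missing nth occurrence returns 0 and truncates the final batch:

```python
# Hypothetical mimic of FileMaker's Position()/Left() to show the bug:
# requesting an occurrence that does not exist returns 0, so
# Left(s; 0 + Length(sep)) keeps almost none of the accumulated batch.

def fm_position(text, needle, start, occurrence):
    """1-based index of the nth occurrence from `start`, or 0 if absent."""
    idx = start - 2               # so the first find() begins at `start`
    for _ in range(occurrence):
        idx = text.find(needle, idx + 1)
        if idx == -1:
            return 0              # FileMaker returns 0, not an error
    return idx + 1

sep = "$Sep"
s = "".join("$chunk%d%s" % (i, sep) for i in range(1, 101))  # 100 separators

p = fm_position(s, sep, 1, 101) + len(sep)  # 101st occurrence is missing
assert fm_position(s, sep, 1, 100) > 0      # the 100th is found fine
assert p == len(sep)                        # Position() returned 0
truncated = s[:p]                           # Left(s; p) drops the records
```

With p collapsed to Length(sep), Left() keeps only the first few characters, which matches the dropped records Simon observed.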

    Could it be that there is an issue when the $totalRecordCount is not a multiple of 100?

    You can reproduce this if you take a copy of MasterDetail and reduce the number of Contact records down from 40,000 to 4301. You’ll see there are blank rows if you scroll to the end of the sidebar. If you delete a single record and rebuild the sidebar the blank rows disappear.

    Sure hope I’m doing something wrong or that this is a bug you can fix.

    Regards,
    Simon.

  7. Simon Plint says:

    Todd,

    Just did a few more experiments with MasterDetail, and a found set of 101 records seems to break the sidebar. 99 is OK and so is 102, but 101 or 201 etc. isn’t.

    Regards,
    Simon.

    • Todd Geist says:

      Hi Simon,

      You are correct. There is a bug in the current deliverable. I already have a fix for this. I’ll be posting it later today.

      Thanks

      Todd

  8. Simon Plint says:

    Excellent!

  9. Daniel Wood says:

    Apologies for the long post but I hope it provides some benefit. I’m not trying to bash HyperList v2 with this post – far from it! I love the technique and use it myself to grab IDs for Virtual List, but I want to bring attention to real-world scenarios and an alternate method when it comes to the “ID” part.

    In the example file, it is shown that HyperList v2 outperforms the other tested methods in obtaining an ID from records in a found set – but only in a situation where the only field in the table is that ID, and when running a local copy of the database. Unfortunately this is not a real-world scenario.

    Let’s say in the real world a normal table might have 20-30 fields, each with various sizes of data. Let’s also say that in the real world most solutions exist on FileMaker Server, either on the local network or remotely – things behave a little differently when the data is not residing on your machine.

    The problem that HyperList v2 and all these other methods of ID collection suffer from when the file goes hosted is the old adage “you touch it, you own it”. These methods loop through records, and subsequently each record looped to will be downloaded from server to client in its entirety (minus summary fields, unstored calcs & containers). Record size is the #1 determining factor of performance for most database solutions as far as I’m concerned, and it’s a shame that most people seem oblivious to this fact 🙁 In the case of this example file, a record set of 200,000 is used to begin to show differences of just seconds between the methods – but in WAN/LAN deployments these differences are huge.

    Okay you say, so even if the example file had ‘real world’ records, and was hosted on FileMaker server, all methods still use a level playing field so to speak, and so the difference in performance still comes down to the method used to build the list of IDs – that is true.

    However – we are talking about for use in virtual lists here. Virtual list is a “temporary” tool. What I mean by this is it is used lots of times for reasons which do not involve storage of data. It’s a reporting tool, a UI tool, it gets wiped on system close and can be used by multiple users and so on – the records are meaningless without an ID.

    So why bother using primary keys at all? A primary key is a stored piece of data and as such to build your list of IDs every record of every ID you add to this list is downloaded from server to client, causing huge performance issues.

    You can still obtain a unique “temporary” identifier for a record for the purposes of virtual list – and that is the Get ( RecordID ) function. Now before you say “but you cannot use that as a primary key because it resets after you clone the database!” you are right – but we are talking temporary here. We are only using it to reference the actual record, not as a stored piece of data. If the database is cloned it is no big deal (in other words we are not using it as a foreign key).

    The method goes as such – add an unstored calc to your table that contains get ( recordID ). Also add a stored version of the calculation. Now, if you tried to obtain the stored version of the calculation then it acts just like any piece of stored data on the record – and FileMaker Server will send you the whole record. BUT, if you grab the unstored version of the calculation, FileMaker Server is somehow designed in such a way that it can send you just the record ID without sending you the entire record itself (maybe through index or some other means). And it’s FAST, VERY FAST!

    —-

    So here is the test. First I added 6 new fields to the table and put a timestamp in each field on each record. The size of the file was around 60 MB. Then I made a copy of the HyperList file which I will call “HyperListX”. In HyperListX I changed the ID field to an unstored calc set to Get ( RecordID ), and I added another field called “ID Stored” which was Get ( RecordID ) as a stored calc.

    So both files are exactly the same – large record set, large data size – the only difference being that HyperListX has an unstored ID and a stored ID.

    I put both files up on a remote FileMaker Server host – accessed over the internet on a constant 5 Mbit connection.

    For the test, I skipped all other methods and stuck to Hyperlist V2 and configured the script to process 200,000 records.

    The results – well, the original HyperList v2 took so long I had to stop it – basically it was downloading the entire database and sending requests for each individual record, so this test was never finished. I then reduced the test to just grab 20,000 records instead (after flushing the cache, of course). This took 50 seconds to grab the IDs of those 20,000 records – because it was in fact grabbing the entire records.

    Next, HyperlistX was tested on the same 20,000 records. The time – 0 seconds (or somewhere between 0 and 1 in other words) – I kid you not. I almost couldn’t believe it myself so I checked the scripts and sure enough all the IDs were in the variable as expected.

    So to take it to the limit I ran the full 200,000 test – remember this is getting 200,000 IDs via an unstored Get ( RecordID ) field – but looping through the found set using the HyperList method nonetheless.
    The time to get all 200,000 IDs was 7 seconds. That’s comparable to HyperList v2 running on the local machine!

    With these IDs you just chuck them into Virtual List, and set up the relationship between VL and your actual table from the VL ID you set, to the stored version of Get ( RecordID ).

    —-

    So what I hope to have shown here is that VirtualList being a temporary data structure can be satisfactorily used with a similar temporary unique ID on the table(s) you are using – and that this temporary unique ID is actually significantly faster than any other method, be it local, WAN or LAN deployment.

    • Todd Geist says:

      Hi Dan,

      Thanks for your comments. It seems like you are proposing that the ID you gather is the unstored record ID, and then you match that back to a stored version of the same. This seems to me to be a perfectly valid HyperList use. HyperList’s main function is to gather any field, and that is what you are doing. So very nice job!

      The new info for me here is that gathering unstored record IDs over the network is very fast. This is news to me and very interesting. Thanks so much for sharing.

      Todd

    • Todd Geist says:

      I just thought of something. What happens if the file is cloned and then the data is imported?

      Your stored and unstored versions of the Record ID would no longer match, right? You would have to make sure that you went into the DB and reset the storage on that field twice, once to un-store it and once to store it again.

      This is not a big deal, except that it requires somebody to manually do something after doing something else, which is almost certainly going to fail at some point.

      Can you think of some way around this?

      UPDATE: I suppose you could use a regular number field instead of a stored calc. And then update it via script after a clone and import. It still worries me a bit though.

      Todd

      • Daniel Wood says:

        Hi Todd, thanks for the reply. Because the stored version of Get(RecordID) is still a calculation, it works nicely and does not break when cloning is involved. This is because when you start with a clone and import the data, after the import the stored calculation is once again re-evaluated (you don’t actually import either calc), so the unstored & stored are effectively reset.

        So if you have a table already with 500 records in it with IDs 1-500, let’s say, and you then import into the table another 500 records from another file where the Get(RecordID) is also 1-500, then when the second set of 500 is imported their Get(RecordID) is reset to 501-1000 accordingly – there is no duplication of the ID at any stage.

        The only situation where this does arise in normal practice is when people use Get(RecordID) as their primary key in a properly stored field, so that it becomes a static piece of data in a field, and then they use it as foreign keys on other tables. In this situation yes things will break – but so long as both ID fields are calculations and never stored as static data, then this won’t occur.

    • Todd Geist says:

      So I finally got around to trying Get(RecordID) with MasterDetail. It does help quite a bit on large record sets. You can read about it on my blog.

  10. Daniel Wood says:

    I should also follow this up with a little bit more info – my 200,000-record test case was actually an optimal scenario: the entire found set, with IDs running in order from 1 through 200,000. In a proper real-world scenario your found set will consist of records that are not in sequence – their IDs are still in order relative to each other, but with gaps between them.

    The way FMS is able to send me the 200,000 so fast is that it does it in ranges. It knows the record IDs of the found set are 1-200,000, so it can send me this info, and my assumption is either server or client can fill in the gaps with the actual IDs. It’s very similar to a snapshot link file: if you have ever looked at one, you’ll notice the found set to display is stored as ranges of record IDs.

    So in order to get around this I modified my found set of 200,000 to the worst-case scenario – every second record omitted, so that there was never a streak of consecutive record IDs in my found set, which was now 100,000.

    In this test, the time taken to produce the IDs was much slower, though still significantly faster than the other methods, given that there is still no record data being sent from server to client. What is being sent in this case are the individual 100,000 record IDs – no ranges can be sent, and it is up to the client to piece together the IDs. In the best-case scenario, the server sent 1-200,000 and the client filled in the gaps.

    While there is a big difference between the best and worst cases, most real-world scenarios will a) work with far fewer records, and b) tend to have some sequences in the records – and if records are in sequence then their IDs are sent faster than if they were not.

    Hope that makes sense!
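Daniel's range observation can be sketched with a toy model (my illustration, not FMS's actual wire protocol): a sorted found set of record IDs collapses to one (start, end) pair per consecutive run, so the every-second-record-omitted case defeats the compression entirely.

```python
# Toy model of sending a found set as record-ID ranges. Consecutive
# runs compress to a single (start, end) pair; each gap forces a new pair.

def to_ranges(ids):
    ranges = []
    for i in sorted(ids):
        if ranges and i == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], i)   # extend the current run
        else:
            ranges.append((i, i))             # a gap starts a new run
    return ranges

# Best case: the whole found set is one run -> a single range.
assert to_ranges(range(1, 11)) == [(1, 10)]
# Worst case: every second record omitted -> one range per record.
assert to_ranges(range(1, 11, 2)) == [(1, 1), (3, 3), (5, 5), (7, 7), (9, 9)]
```

In the best case the payload is constant-size regardless of found-set size; in the worst case it grows linearly with the number of records, which lines up with the slowdown Daniel measured.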

  11. Justin Close says:

    Todd and Daniel,
    Just happened upon HyperList when looking up solutions for how to collect a list of unique identifiers for some records. I love that you all are contributing to the community like this, trying to improve not only our solutions, but our understanding of FileMaker, to make us better developers.

    The background for my particular scenario: On a hosted solution a user navigates to a found set of records, arrived at through a rather complicated filter that results in some 150-300 records (filter has various variable inputs). This layout shows lots of column data, as well as having many conditional formatting calculations. Some data was being shown from related tables (like a name or title); some CF calcs relied on related data, too.

    Our overall goal: improve performance of this list view (it had been taking 20-30 seconds to load, scrolling was very hesitant). In order to improve load times and sorting/scrolling we have been eliminating calculations (e.g. moving them to scripted or auto-enter calcs), and then also just putting copies of data into the local table so it didn’t have to transfer related records. (Yeah, I know, not great normalization, and can add a layer of complexity, but doing this makes the load time and scrolling MUCH MUCH better, and it doesn’t require increasing the scope of the task to include re-architecting our entire DB. Our table architecture isn’t the greatest to begin with, i.e. very wide tables, so we are putting something of a band-aid solution in place here.)

    We have 5 fields in particular that were related data being displayed on the layout (rather static data). We are copying this data into the local table. So we now have to make sure the local copy is accurate. To do this we could have put script triggers on all the layouts where these 5 fields might have been editable, but that seemed cumbersome, spread out, and prone to mistakes. So I was trying the idea of doing a data-check-and-copy at the time of layout load. Doing this on the client didn’t really improve anything, as it had to download the related record (one with a huge table width) in order to update the local data.

    I was able to limit the amount of data that needed to copy and update by creating a relationship that used the 5 fields in question (plus the primary key field), and then just do an ‘exists’ check to the other table to see if they were already the same. That helped some.

    My solution (so far): use an FMXML server-side script trigger trick (waiting anxiously for 13! 🙂 ) that would recreate the current user’s found set and do the updating on the server side for those same records. To do that I needed a way of recreating the current found set quickly, and hence I was looking for ways to get a list of some unique ID for each record. This list is then passed to the server side script as a parameter. (I could have recreated the same filter & find routine as the user goes through on the client, but that didn’t seem like it would work; I guess I didn’t end up trying it…)

    What I found, though, was that HyperList was noticeably slower at generating the IDs on the client than a simple ‘loop through all records and build string of PKs’. Here are some numbers that I ended up with (ran each test after closing/opening the file & navigating to the layout in question before manually triggering the script; i.e. no caching):

    Loop: .012, .012, .013 , .012
    HyperList: .088, .261, .093, .091, .175

    So some questions:
    Todd, you mentioned above that you were planning on updating HyperList to not use Evaluate(). Did that happen? I did see one script step with ‘Evaluate(…)’ in it that was commented out. Was that the only instance?
    Is HyperList highly found-set-count dependent? My found set in this case was only ~200 records (and that is really all it will probably ever be). Would higher record counts start to show more differences?

    One possible flaw with the looping technique (which I wasn’t necessarily able to contrast against HyperList because it was intermittent): I found a very few outlier conditions where the local loop for ID-building took 15-20 seconds. Yikes! I had been testing this for numerous hours and never had a problem; then at the end of the afternoon I started to periodically see this spike. I was thinking that maybe Daniel’s suggestion of using the Get(RecordID) calculation might alleviate that weird outlier spike, or perhaps even make the sequence faster. (Hard to beat .012s I guess. 🙂 )

    Thanks for listening,
    Justin

    • Todd Geist says:

      Hi Justin,

      Hyperlist is optimized for large numbers of records. You probably won’t see a difference until you get beyond 1000 records. The actual number will depend on your key length.

      The current version has all the improvements that have been suggested to date. On my desktop it can gather 200,000 record IDs in under 10 seconds. A straight looping script would probably never complete at that number of records. It would just spin forever, getting slower and slower.

      Hope that answers your question

      Todd

  12. […] key.  Its really rather brilliant and turns out to work quite well.  You can read his full explanation over at Modular FileMaker. Its in the comments for […]

  13. Paul Jansen says:

    I played briefly with using the new summary field type ‘List of’ and it seems pretty fast locally, but variable on a WAN solution. More homework required I guess…

    • Todd Geist says:

      The problem with ListOf is that it is a Field. You have to define a summary field for every field you want to gather. Yuck!

      Also try HyperList with Perform Script On server. Talk about awesome 🙂

      Todd

  14. john renfrew says:

    so what would be needed is a standard way to package up the search that creates the found set so it could be passed as a ‘parameter’… ideally as some abstracted module to then run on the server.
    I came across the issue testing that PDF stuff I showed at Pause, as the unexpected results I got at first were because the server is a different user, so it was on a different record to me when I ran the script there. – works really week now though.

  15. john renfrew says:

    *well*

  16. Greg says:

    Is there some way to gather more than one field?

    • Todd Geist says:

      Hi Greg,

      I know that Jason Young over at SeedCode is working on a variation that can handle up to three fields at once. I’ll see if I can get him to finish it up. 🙂

  17. Greg says:

    Thanks, Todd

  18. Paul Jansen says:

    Interested to see version 2. A couple of questions..

    Is version 2 a drop in replacement for version 1?
    How does ‘List of’ perform on a WAN compared to hyperlist v2?

    Thanks

    • Todd Geist says:

      well… define “drop in” 🙂

      There is still a single script that gets called. And it can still take just a single field, same as before. It is backwards compatible. But you are going to need to do a little tiny bit more than just copy in the contents of that one script. What I would do is make a note of all the places I am using it, delete the old version’s folder, and copy in the new version. Then re-connect the code to the new version.

      It’s about as “drop-in” as you can make it 🙂

      ListOf is at least 2x the speed of HyperList v1 over the WAN.

      Todd

  19. Tim Anderson says:

    I am getting an odd result with HyperList2 using the script parameter
    “Contacts::zkUIDp¶Contacts::nameFirst¶Contacts::email”

    The 44th result returns the nameFirst from the 43rd record – the zkUIDp and email are from the 44th. This repeats every 100 records – that is the second parameter for the x44th record comes from the x43rd record, the other fields returned are from the x44th record.

    FMPA13 and FMP12.0v5, OSX 10.7.5 – local file

    Anyone else seeing this – seems very odd! I have tried adding a 4th field – that works as expected

    Thanks

  20. Tim Anderson says:

    Thanks Todd,

    always the second field, whether there are 2, 3 or 4 fields

    Tim

  21. […] Anderson reported an awesome bug with HyperList version 2.0.  I wanted to jump right in there and start fixing but the main HyperList script is a […]

  22. Tim Anderson says:

    Thanks Todd.
    I’m going to have to go and look at the code now, had been keeping clear but having watched the video…
    Love the idea of Unit Tests as well.

  23. Tom R. says:

    Hi Todd –

    In your video over at https://www.geistinteractive.com/2014/01/08/gathering-filemaker-records-hyperlist/, you mentioned there is another video detailing use of Hyperlist in creating a picker solution. I haven’t been able to find the video, but it would certainly be interesting to see it and/or a technique file.

    Thanks!

  24. Todd Geist says:

    Hi Tom,

    All of my videos are on my site. Maybe I was talking about Selector Connector. I think I used Hyperlist in there. I can’t remember. I am in full on DevCon mode right now, so looking for it will have to wait.

    Sorry

    Todd
