Managing Performance

This class will teach you how to make the right choices to ensure fast application performance.

Prerequisites

What You'll Learn

After completing this course, you should be able to answer these questions:

  1. What is the difference between performance, scalability, and elasticity?
  2. What is acceptable performance?
  3. Who is responsible for performance?
  4. What application constructs tend to perform well or to improve performance?
  5. What application constructs tend to cause slowness or reduce performance?
  6. What are some of my options to improve performance at a larger scale?

Performance, Scalability, and Elasticity

Performance

Performance is a general term, most commonly understood as users' perception of the speed of an application. The concept actually encompasses a wide range of variables and characteristics, from the hardware, the virtualization layer, and the operating system, through the database and scripting languages, across the network and the internet, and ultimately ending up in a client's browser. Each of these variables can make a meaningful impact on performance, as discussed below. Two concepts, however, deserve special mention: scalability and elasticity.

Scalability

Scalability refers to an application's ability to grow in the number of users it serves or the size of the database it manages. As these factors increase, absent some intervention, performance will eventually suffer. Therefore, you need to consider performance requirements “at peak load”, meaning with the number of users and amount of data you ultimately expect to manage. What's more, if you anticipate these numbers may grow, then a sense of your application's scalability becomes important.

Elasticity

The concept of Elasticity represents a different perspective on the discussion of performance. Elasticity in this context refers to the innate ability of the tools you use to expand and contract as your need for performance and scalability expands or contracts. Elasticity is a great feature for your vendor to provide because it gives you an added avenue to address any performance or scalability issues you may be facing, in particular when those issues fluctuate rapidly or come on unexpectedly.

Elasticity comes in degrees, however. Complete elasticity means you never have to do anything to receive it. For example, storage on Amazon's S3 service is completely elastic, in that you never have to predefine or modify a total expected storage amount. Quite simply, as you require more storage, more storage becomes available. Partial elasticity means that although changes may be easy to make, they do not happen automatically. For example, Amazon's EC2 server service is only partially elastic, in that you must predefine the amount of processing power and RAM you will need, and if your needs change, you may have to jump through some hoops to get your software onto a server with a higher resource level. Choosing WorkXpress as the platform to manage your EC2 infrastructure makes that choice more elastic, however, by providing simple point-and-click tools to move your software completely from a lower-powered EC2 server to a higher-powered one, and vice versa.

Some of the tools that Cloud Computing typically makes either partially or fully elastic include:

  • Processor power
  • RAM memory
  • Bandwidth
  • Load balancing
  • Storage capacity
  • Quantity of servers in a cluster (see below)

General Performance Considerations

Performance can be impacted by many things. It is very important to understand that any one of these factors, among many others, could be the root cause of perceived slowness in a particular application:

Client Side

  • Use of a cutting-edge browser (for example, as of November 2009, Google Chrome was the fastest-performing browser tested, and could produce noticeable speed increases on complex pages over slower browsers)
  • Use of a computer with effective processing power
  • Use of a broadband internet connection
  • Lack of other processes on the client machine competing for resources
  • Proper browser settings to enable image or other caching

Server Side (Your Infrastructure choice)

  • Use of a server that is amply resourced in terms of processor power and RAM memory for the requirements of your application given its user load
  • Use of a server that has fast disk access
  • Use of a server whose virtualization software is up to date including the latest tools and optimization capabilities
  • Access to appropriate bandwidth to serve the demanded load
  • Server properly routed to the internet through a router capable of handling all of the traffic assigned to it
  • Server properly routed through a LAN properly sized to handle the traffic passing through it

Platform Side (WorkXpress)

  • The WorkXpress platform itself must be developed with performance in mind. Any capabilities it offers need to be optimized to the fullest possible extent, including how it manages the infrastructure and bandwidth resources allocated to it.

Application Side (your Software)

  • Your Application must be constructed with an appreciation for all of the limitations mentioned above: on the client side, in your infrastructure, and in the WorkXpress platform.

Again, it is important that the WorkXpress software builder be aware that all of the above and more contribute to overall user performance and, therefore, user experience. However, next to your infrastructure choices, the factor most controlled by you, the builder, is how you build your application (the software factor).

Therefore, the remainder of this article will focus on Application Side Performance opportunities and risks that you can control as you build your application. Specifically, it will discuss those considerations from the standpoint of your data layer, your interface layer, and your logic layer.

Data Layer Performance Considerations

Quantity of data

Put simply, the more data you have in your production application, the more work the computer must do when it accesses those data stores.

The good news, though, is that large data sets by themselves do not materially impact performance. In fact, accessing very large data stores efficiently can be faster than accessing small data stores inefficiently.

You'll learn more about the concept of efficient data access in the following sections.

Numbers, Short Text and Long Text

Different types of data are intrinsically faster or slower to search or access than others. WorkXpress distinguishes three primary types of data, and they are (from fastest to slowest):

  1. Numbers
  2. Short Text
  3. Long Text

Searching by "Starts With", "Is", and "Contains"

Some methods of searching for a value are much faster than others. Building a search where the result “is” (is equivalent to) something is very fast. Building a search where the result “starts with” something is also very fast.

However, building a search where the result list simply “contains” the search string in question can reduce performance noticeably. This condition is exacerbated by particularly long blocks of text and larger data sets.

As a result, you may consider restricting the use of “search contains” operations in lower-performing areas of your system.
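The effect is easy to demonstrate against a relational database. The sketch below uses SQLite purely for illustration (it is not a claim about how WorkXpress stores data): with an index on the column, a “starts with” search can seek directly into the index, while a “contains” search must examine every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA case_sensitive_like = ON")  # lets SQLite use the index for prefix LIKE
conn.execute("CREATE TABLE leads (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_leads_name ON leads(name)")
conn.executemany("INSERT INTO leads(name) VALUES (?)",
                 [("customer%d" % i,) for i in range(1000)])

def plan(query):
    # EXPLAIN QUERY PLAN reports whether SQLite will seek an index or scan the table
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(str(r[-1]) for r in rows)

# "starts with": the index narrows the search to a small range of keys (a SEARCH)
print(plan("SELECT * FROM leads WHERE name LIKE 'customer1%'"))

# "contains": every row must be examined (a SCAN)
print(plan("SELECT * FROM leads WHERE name LIKE '%stomer1%'"))
```

The first plan reports a SEARCH using the index; the second reports a full SCAN, which is exactly the cost difference described above, and it grows with the size of the data set.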

Length of query path

When using the Visual Query tool, you have enormous power to design highly complex and long queries. Just because you are given that power, however, does not mean you should always use it! The more complex your query becomes, the lower the performance you will get from it.

But what do you do when you simply need a long query? There are a number of ways to minimize your dependence on very long queries, and some of them are discussed in greater detail in this section. One great way to shrink a query length is to “preprocess” by adding additional relationships or making duplicate values, and this is covered next.

Postprocessing and Preprocessing

Very commonly you can significantly improve the perceived performance of your application by shifting effort from “postprocessing” to “preprocessing”. These terms are described below:

Postprocessing - performing work at the time it is requested by a user.

Preprocessing - performing extra work at some previous time, so that when a user requests the work there is now less that needs to be done.

Using relationships to preprocess

For example, rather than build a “very long query”, why not create a relationship between two Item Types that significantly shortens the required query (something that jumps you right from the beginning to the end, so to speak)? You could use Actions to automatically create that relationship at some earlier point in your application's process (i.e., you could “preprocess” the creation of that relationship), so that when the need arises, a smaller query will generate exactly the same result as the longer “postprocessed” query.
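A minimal sketch of the idea, using plain Python dictionaries to stand in for Item Types and relationships (the names and structures here are illustrative, not WorkXpress APIs): a customer connects to shipments only through orders, so an Action maintains a direct customer-to-shipment relationship at creation time.

```python
orders_by_customer = {}     # customer -> [order, ...]
shipments_by_order = {}     # order -> [shipment, ...]
shipments_by_customer = {}  # the preprocessed "shortcut" relationship

def on_shipment_created(customer, order, shipment):
    shipments_by_order.setdefault(order, []).append(shipment)
    # "preprocessing": also record the direct customer -> shipment link now
    shipments_by_customer.setdefault(customer, []).append(shipment)

# Postprocessed: walk the full path customer -> order -> shipment at request time.
def shipments_long(customer):
    return [s for o in orders_by_customer.get(customer, [])
              for s in shipments_by_order.get(o, [])]

# Preprocessed: one hop across the shortcut relationship.
def shipments_short(customer):
    return shipments_by_customer.get(customer, [])

orders_by_customer["cust1"] = ["ord1", "ord2"]
on_shipment_created("cust1", "ord1", "ship1")
on_shipment_created("cust1", "ord2", "ship2")
print(shipments_long("cust1"))    # ['ship1', 'ship2']
print(shipments_short("cust1"))   # ['ship1', 'ship2'], via one hop instead of two
```

Both functions return the same answer; the short version simply has less work left to do when the user asks.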

Using data duplication to preprocess

In another example, you must use a query to grab data from a variety of Items. Because of the size of your data sets, the nature of your interface, or the volume of automations occurring, you determine this query to be underperforming.

Why not at some previous time use Actions to make copies of the desired data on Fields stored against only one Item Type? Then, when you go to run your query, you can grab all of your data off of a single item, which is very fast!

In this example, you have “preprocessed” the copying of data from a variety of locations to a single location to reduce the “postprocessing” time and make the user experience better.
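The same idea in a short sketch, again with hypothetical record structures rather than WorkXpress APIs: the customer's name is copied onto the order when the order is created, so a later read touches one record instead of following a relationship.

```python
customers = {"c1": {"name": "Acme Corp"}}
orders = {}

# Postprocessed: follow the relationship to the customer on every read.
def order_label_slow(order_id):
    order = orders[order_id]
    return customers[order["customer_id"]]["name"] + " / " + order["number"]

# Preprocessed: an Action copies the customer's name onto the order when it is
# created (and would re-copy it if the name ever changed).
def create_order(order_id, customer_id, number):
    orders[order_id] = {"customer_id": customer_id, "number": number,
                        "customer_name": customers[customer_id]["name"]}

# Now reads grab everything off a single record, which is very fast.
def order_label_fast(order_id):
    order = orders[order_id]
    return order["customer_name"] + " / " + order["number"]

create_order("o1", "c1", "INV-1")
print(order_label_slow("o1"))   # Acme Corp / INV-1
print(order_label_fast("o1"))   # Acme Corp / INV-1, from one record
```

The trade-off to keep in mind: duplicated values must be kept in sync by your Actions whenever the source data changes.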

Starting big versus starting small

Imagine this scenario. You have 1 million buildings in your database, and each building has 100 rooms related to it. You create two Item Types, building and room, and you create one relationship: building to room.

Later, you want to write a query that finds all the buildings that have rooms with purple walls. Imagine though that in all 100 million rooms, only one of them actually has purple painted walls.

One way to find this building and room would be to design your query with Rooms at the top, going across the relationship to each room's Building, and adding a filter on Rooms for “purple walls”.

A second way to build this query would be to put Buildings at the top, going across your relationship to Rooms, and then putting the filter on Rooms for “purple walls”.

Either query will get you the result you want; however, one of them will get it MUCH more quickly!

In fact, the first way will be better. This is because the system will quickly find just the room which is purple, and proceed to only find a single building connected to it.

In the second way, the system will get all buildings, then get all of their rooms, then find the room which is purple, and then restrict back to the only building being asked for.

The difference has to do with whether the system needs to evaluate all 1 million buildings, or only just one!

Generally speaking, if you can put the more restrictive elements of your query towards the top, then the system will have fewer and fewer records to evaluate as it progresses down through your query.

Keep this hint in mind!
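The difference can be shown by counting how many records each plan evaluates. The data layout below is a stand-in (smaller numbers, plain Python structures, a color index of the kind a database maintains for you); the point is only the order of operations.

```python
from collections import defaultdict

# 1,000 buildings with 100 rooms each; exactly one room has purple walls.
# Each room is a (room_id, wall_color) tuple.
rooms_by_building = {b: [(b * 100 + r, "white") for r in range(100)]
                     for b in range(1000)}
rooms_by_building[543][21] = (54321, "purple")

# An index from wall color to (building, room), as a database would maintain.
rooms_by_color = defaultdict(list)
for b, rooms in rooms_by_building.items():
    for room in rooms:
        rooms_by_color[room[1]].append((b, room[0]))

# Plan 1: restrictive element first. The color index narrows the search to the
# single purple room, and only its building is then fetched.
evaluated = 0
buildings_plan1 = set()
for b, _room_id in rooms_by_color["purple"]:
    evaluated += 1
    buildings_plan1.add(b)
print(buildings_plan1, evaluated)   # one building found after evaluating 1 record

# Plan 2: restrictive element last. Every building's rooms are evaluated before
# the filter can discard them.
evaluated = 0
buildings_plan2 = set()
for b, rooms in rooms_by_building.items():
    for room in rooms:
        evaluated += 1
        if room[1] == "purple":
            buildings_plan2.add(b)
print(buildings_plan2, evaluated)   # same answer after evaluating 100,000 records
```

Same answer both times, but plan 1 evaluated one record where plan 2 evaluated one hundred thousand. That is the effect of putting the restrictive element at the top.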

Evaluated filters and nesting evaluated filters

One of the WorkXpress Query Tool's more interesting and powerful features is the ability to build dynamic evaluations. These are evaluations where the value being evaluated against is itself another variable! What's more, there is no limit to how many times you can nest these evaluated filters.

For example, a normal filter might be “where the State is <Pennsylvania>”. An evaluated filter instead might be something like “where the State is <check the Current User, find which Sales Team he is related to, find which State that Sales Team is responsible for, and insert that State here>”.

Beware, though! Each evaluated filter makes the overall filter that much slower to evaluate, particularly when the sub-evaluation itself is complex or slow.

Evaluations are fastest when performed against a constant; used purposefully, they can also be performed against sub-evaluations.

Sorting

Every time you make extra work for the system, keep in mind that it may make you wait while it does that work. Sorting is a great example. While it is fast for a database to grab “all records of type X”, it is not nearly as fast to “grab all records of type X and sort them based on Field Y”.

This slowing effect can be compounded heavily when sorting on things that are themselves variable, like evaluated fields or formulas. Think about it: if you want to sort on a field of static text or numbers, that's one thing, but if you want to sort on a field that first requires some calculation just to determine the value to sort on…well…against a very large record set this can be very time consuming!

Use sorting sparingly, and avoid sorting on calculated or variable fields in all but the smallest record sets. If you must sort a large record set based on a calculated field, consider “pre-processing” that field as described above.
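That pre-processing approach can be sketched as follows, with hypothetical record structures: the total is computed once, whenever the line items change, so a later sort reads a stored number instead of re-running the formula for every record.

```python
# Each record stores its line items and a preprocessed total.
def compute_total(record):
    # stand-in for a calculated/formula field
    return sum(qty * price for qty, price in record["lines"])

def on_lines_changed(record):
    record["total"] = compute_total(record)   # the "preprocessing" step

records = [{"lines": [(2, 10.0), (1, 5.0)]},   # total 25.0
           {"lines": [(1, 3.0)]}]              # total 3.0
for rec in records:
    on_lines_changed(rec)

# Postprocessed sort: re-evaluates the formula for every record, every time.
by_formula = sorted(records, key=compute_total)

# Preprocessed sort: reads a stored field; no per-record calculation at sort time.
by_stored = sorted(records, key=lambda r: r["total"])

print([r["total"] for r in by_stored])   # [3.0, 25.0]
```

Both sorts produce the same order; only the stored-field version stays cheap as the record set grows.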

Fields with Parts

Some fields are more complex than others, and require more care than others. For example, an “Address” field contains a variety of sub-components to it such as “street 1”, “street 2”, “city”, “state” and “zip”. We call each of these sub-components “field parts”.

Fields with parts are intrinsically only slightly less performant than fields without parts, and you are free to use them. However, be extra cautious when, for example, sorting based on a part of a field. This creates a scenario where the system must first identify the part of the field that's being sorted on for every single record in the system, and then perform the sort. Remember that making more work for the system on large record sets can mean making more wait time for the User.

Also, when performing mass operations against a part of a field (as compared to an entire field), you may notice performance reductions. It's one thing to say “for each of my 10,000 records, go update the status to read <X>”. It's a different thing altogether to say “for each of my 10,000 records, update the part of the address field called “state” to read <Y>”. In the first scenario, the system simply has to go to the field and make the change. In the second scenario, the system must go to the field, extract the part, change the part, and insert just that part back into the field.

Field parts are an extremely powerful feature; use them wisely!

Interface Layer Performance Considerations

Complex pages

Put simply, the more you put in your page, or the more you ask your page to do, the longer it will take. If you make a single page that displays every layout about everything, that page will take a lot longer to load than a page highly focused on addressing a specific need.

Remember that for everything on your page, the server needs to assemble it, the internet needs to transport it, and the User's browser needs to interpret and present it. The more your page attempts to accomplish, the longer each of these steps will take.

Open/Closed and Showing/Hiding Layouts on page load

Keep in mind that not every Layout needs to be “open” when the page loads. You can always default a Layout to “closed”. When (and if!) a User needs to see the contents of that Layout, they can choose to open it manually, and all the work will be performed at that time. Starting a Layout as “closed” shifts the work from every single page load to only those times when a User actually needs to see that content and asks for it.

Another technique is to apply security to “hide” a Layout. You might use this when there are Layouts on a page that are only relevant to certain people at certain times, and at other times are completely unnecessary. For example, perhaps there is a Layout where one needs to see “emergency response” instructions, but only when the status of the item is set to “emergency”. In this example, the “emergency response instructions” Layout does not need to be visible most of the time. So, why not use security to “hide” the Layout except when the status is “emergency”? This way, the page only needs to load that Layout in the rare circumstance that the information is actually needed.
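The deferral pattern behind “closed” Layouts can be sketched in a few lines. The class and names here are illustrative, not WorkXpress internals: the expensive content is produced only when the User first opens the Layout, not on every page load.

```python
class LazyLayout:
    """A layout whose content is built only when it is first opened."""
    def __init__(self, name, render, open_by_default=False):
        self.name = name
        self._render = render        # expensive content builder
        self._content = None
        self.is_open = open_by_default
        if open_by_default:
            self._content = render() # "open" layouts pay the cost at page load

    def open(self):
        self.is_open = True
        if self._content is None:    # work happens only on first open
            self._content = self._render()
        return self._content

render_calls = 0
def expensive_report():
    # stand-in for a slow query and layout assembly
    global render_calls
    render_calls += 1
    return "<table>...</table>"

layout = LazyLayout("history", expensive_report)   # defaults to closed
print(render_calls)   # 0: nothing rendered at page load
layout.open()
layout.open()
print(render_calls)   # 1: rendered once, on first open only
```

Every page load of a "closed" layout costs nothing; the work is shifted to the (possibly rare) moment a User actually asks for it.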

List Layouts

Lists are a common source of performance degradation. You can make a list slow by making the “List of…” query long or complex (see Data Layer Considerations above), but you can also make a list slow by building columns of information that are themselves cumbersome to produce (for example, a column of “evaluated fields” or of “report fields”; see below). Finally, you can make a List Layout slow simply by asking it to display a large number of values at a time.

The first thing to consider with List performance is the query you've used to derive the list itself. This is defined in the “List of…” parameter in the List configuration tab of the Edit Blocks tool. All of the points discussed above in the Data Layer Considerations section apply to optimizing the performance of the “List of…” parameter.

The second thing to consider is the complexity of the columns you choose to display. Displaying more columns, and more complex columns, will take longer, and these delays are compounded as the number of records your list is set to display increases. “More columns” and “longer lists” are easy enough to understand, but what is meant by “complex columns”? Keep in mind that every value for every combination of row and column must be processed. If a column involves long or complex queries to get its result, or a complex display mechanism such as an “evaluated field”, it will take that much more processing and can result in delays.

Whatever you do when displaying lists, keep in mind that the longer you set your list to be, the more effort will be required to process it. You can always add the ability to generate filters at the top, or move work off to a “report”, either of which will often lessen the need for long lists. However, if you must display a long list, think carefully about the columns you absolutely must have and pare that list down accordingly.

With lists, the bottom line is this: the more you display, and the more complex the fields you display, the slower the list will be.

Evaluated Fields

There are a variety of Field Types in WorkXpress, and some can take longer to present to a User than others. One excellent example is an “Evaluated Field”. This type of Field performs an evaluation at load-time, and based on the results, determines exactly which Field to display. For example, an evaluated field could display the word “okay” colored green if a status is “good”, but the same field could display the word “warning!” in red if a status is “not good”.

By themselves, Evaluated Fields typically do not significantly influence performance. However, placing an Evaluated Field as a column of a list means the list has to display the Evaluated Field once for every row it is displaying, and this can begin to slow it down. So, if you have 100,000 records but want to display 30 in your list, the system will need to evaluate the field 30 times.

Going a step further, however, if you elect to SORT your list based on the Evaluated Field column, the system must perform your evaluation for each and every result in the list, whether displayed or not. So, if you have 100,000 records but only want to display 30, you would still need to determine the result of the evaluation for all 100,000 records before presenting your list.

Evaluated Fields are an extremely powerful interface tool, however, be judicious in your use of them.

Report Fields

A Report Field sounds innocent enough on the surface, but the truth is it can be a performance-reducing monster! A Report Field exists quite simply to run a report and then display its results. You can embed a Report Field anywhere in your application that you could otherwise place a Field, and this includes as a column in a List. So, just imagine: you are displaying 50 results in a list, and for each result a report is running in the background and then displaying. If that report has a complicated data query (see Data Layer Considerations above), or if it is running against a very large data set, then 50 reports may take quite a while to run.

Report Fields are terrific because they let you present complex results to your Users at just the right time. However, too many of them at the same time may hurt more than they help!

Changing Context

We learned in our lesson about Context that every Building Block in the interface layer needs to have knowledge of what it is to display data about, and that this concept is called “context”. We also learned that we have the power to change that context at any point in the interface or logic. What we didn't yet learn is that this power is not to be abused!

On one hand, changing context is exactly what allows us to build rich interfaces about disparate and/or related sets of data. On the other hand, changing context represents extra work the computer will have to do in order to display your page.

The bottom line with context is that inherited context is very fast, and does not impact performance. However changing context causes a small speed bump in the road to loading your page. The question you need to answer as you design your application is “how many speed bumps can your Users live with?”

Logic Layer Performance Considerations

Data Layer Performance Considerations Apply

Actions are heavily dependent upon access to the data layer through use of the Query Builder tool and the Expression Builder tool. Every time this access takes place within an Action, the performance considerations of data layer access (described above) will apply.

Lots of Automation

Like anything else, the most fundamental consideration when evaluating the impact of automation on performance is simply “how much automation is being asked for?” The more automation, in general, the greater the impact on performance.

Cascading Actions (Invoking Objects)

As a corollary to “Lots of Automation” above, whenever an Action, in the course of its work, invokes another building block and its own Actions (invokes an object), the total quantity of automation involved can begin to grow. As always, more automation means more performance degradation, so stay on a sharp lookout for large chains of invoked Actions.

The good news here is that any invoked Actions are completely exposed by drilling down into the procedure you are evaluating, so it shouldn't be much trouble discovering them.

Infinite Loops

As a corollary to “Cascading Actions” above, it is theoretically possible to have Object A invoke Object B, while automation on Object B in turn invokes Object A. A then invokes B again, B again invokes A, and the cycle may never stop, resulting in an “infinite loop”.

The result is a thread of processing that will be permanently stolen from your infrastructure, and may result in a noticeable performance reduction.

You never want to build an Infinite Loop into your application if your goal is to keep your Users' experience a pleasant one.
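A common defense against such cycles is to track which objects the current cascade has already visited and refuse to re-trigger them. The sketch below illustrates the idea; it is not a description of WorkXpress's actual safeguards, and all names are hypothetical.

```python
calls = []

def run_actions(obj, handlers, _visited=None):
    """Run obj's automation, following invocations, but never revisit an object."""
    visited = _visited if _visited is not None else set()
    if obj in visited:          # A -> B -> A: break the cycle here
        return
    visited.add(obj)
    calls.append(obj)           # record that this object's automation ran
    for target in handlers.get(obj, []):
        run_actions(target, handlers, visited)

# Object A invokes B, and B's automation invokes A right back.
handlers = {"A": ["B"], "B": ["A"]}
run_actions("A", handlers)      # terminates instead of recursing forever
print(calls)   # ['A', 'B']: each object ran once; the cycle was cut
```

Without the visited check, the same two-object configuration would recurse until the process ran out of stack, which is exactly the "stolen thread of processing" described above.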

Actions which could be "Threaded"

Not all automation needs to hold up the User's experience. For example, when a particular page saves, a series of automations results. Normally, the User must wait for this procedure to complete before the page indicates that it has finished saving and loads the next page. However, if the results of some automation are not expected or needed by the User immediately, there is little reason to force them to wait for the automation to complete.

The WorkXpress Action Manager tool gives you the ability to “thread” Actions. This means that the Action and any of its children are run separately from the current procedure, so that the rest of the procedure is allowed to finish. The now “threaded” tree of Actions is moved into a queue along with all other queued Actions, and processed according to the rules of the queue.

The important thing is that the User is not waiting on these Actions to complete. The User's page can save, the User can move to the next page, and those Actions which the User triggered with their “save” can be run in the background.
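The general pattern behind threaded Actions, a background queue drained by a worker, can be sketched with Python's standard library. This illustrates the concept only; it is not WorkXpress's implementation, and the function names are hypothetical.

```python
import queue
import threading

action_queue = queue.Queue()

def worker():
    # drains the queue in the background, off the user's request path
    while True:
        action = action_queue.get()
        if action is None:       # sentinel: shut the worker down
            break
        action()                 # slow work happens here, out of band
        action_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

sent = []
def send_notification_emails():
    sent.append("emails")        # stands in for a slow automation

def save_page():
    # ... write the user's changes ...
    action_queue.put(send_notification_emails)   # queued, not awaited
    return "saved"               # the user sees this without waiting

print(save_page())               # returns immediately
action_queue.join()              # for the demo only: wait for the queue to drain
print(sent)
```

The save returns as soon as the slow work is enqueued; the worker runs it "in the background" exactly as described above, and the queue's rules decide when.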

Actions which could be "Scheduled"

Some Actions are not triggered by a User's activity and can be run at any time. For example, mining your customer database for “leads who have not received an email in the last 30 days” so that the system can automatically send them a reminder email is the type of automation that probably does not need to run at a particular time of day.

Rather than running these Actions during a busy time of day, when system resources may be constrained and performance thereby impacted, these Actions can be “scheduled” to run, for example, in the middle of the night when no one is using the system.

Scheduled Actions are placed into the same queue as threaded Actions and are run according to the rules of the queue. The important thing, however, is that they are unlikely to impact the User's experience.

Other Performance Considerations

Tenancy

In a multi-tenant environment (see 203), it is important to remember that multiple applications rely upon the same set of physical resources. One simple way of improving the performance of a multi-tenant application is to use the WorkXpress Cloud Management Portal to move your application onto its own single-tenant Infrastructure.

Bigger Infrastructure

Sometimes you just have more demand for resources than your infrastructure can handle. Fortunately, in the cloud, it's relatively easy and affordable to simply get bigger infrastructure. See your WorkXpress Cloud Management Portal and associated documentation for more information.

Clustering

As described in WX 203, Clustering is a method whereby multiple Virtual Appliances all work together to increase the power and capability of your infrastructure. For larger installations or extremely high data volumes, clustering may be able to help you.

Conclusion

Performance generally describes your users' experiences with your application. Scalability describes your application's ability to handle increased users or volumes of data. Elasticity refers to your infrastructure's ability to provide greater capabilities as needed.

You as the application builder have tremendous latitude over the performance of your application. You can design things that are fast, and you can design things that are slow. WorkXpress build tools don't limit you in these ways, but they do serve to advise you from time to time.

Generally speaking, the more work you make the computer do, the slower your application will be. You can create work for the computer in the Data Layer with your Query Tool and Expression Builder. You can create work for the computer by designing complex interfaces. Finally, you can create work for the computer by designing elaborate automations.

While it is possible to do each of these and still deliver good performance across large user and data volumes, it can become important for you as the builder to understand and tweak your application design to meet your performance goals.

Sometimes, though, you can't tweak your application any further, but you still need to grow. When this happens, you have a variety of options to modify or increase your Infrastructure to support the growing demands and maintain performance.

You are now ready to build an application! Please proceed to Building Your First Application to continue on your path to Cloud Mastery!

402/managing performance.txt · Last modified: 2016/09/14 18:19 (external edit)
Copyright WorkXpress, 2024