Scalability guide

Scalability guide

This document covers the specifications and recommendations to analyze large volumes of data using Analytics Plus.

Scalability ceiling for optimal performance

There are various factors that influence optimal application performance. The following table outlines the recommended higher limits that should be considered while setting up Analytics Plus.

FactorRecommended higher limit
Maximum number of rows in a single table (periodic data fetch)200 million
Maximum number of rows in a single workspace (split across several tables)500 million
Maximum number of rows in the application (split across several workspaces)

This depends on your server's processing power (cores and memory). Increased number of rows in one workspace would not affect the performance of other workspaces within a single installation. However, an increased number of concurrent users may reduce application response time.

An installation with data split across multiple workspaces will deliver better performance than an installation with data concentrated in one workspace.

Maximum number of users accessing the application simultaneously1050 (50 users, and 1000 Viewers & Concurrent Guests) can use the application at the same time without performance degradation. This limit can increase if the volume of data being accessed simultaneously is less than 500 million rows.

Note:You may increase the number of data rows without compromising application performance if there are a lower number of concurrent users in the application.

Recommended configuration for high scalability (Distributed mode)

For better performance when dealing with large data volumes, it is recommended to run the application layer and the database layer in two separate servers i.e. in a distributed mode. You can experience a 25% increase in performance while running the application in a distributed mode, as opposed to when both the application and the database is on the same server.

You may run the application in a distributed mode even if EITHER ONE of the following statements are true.

  1. You are looking to analyze more than 100 million rows of data using Analytics Plus.
  2. You expect more than 50 users to access the application simultaneously at any given point in time.

System requirements for distributed mode

Considering a maximum of 500 million rows of data in your application, the following are the minimum system requirements for improved scalability and performance.

System requirements for the application server:

  • 32 GB RAM
  • 250 GB SSD
  • 6 core processor
  • 975MB/sec download network speed and 947MB/sec upload network speed

System requirements for the database server:

  • 128 GB RAM
  • 2 TB SSD
  • 32 core processor

For higher volumes of data, please write to analyticsplus-support@manageengine.eu (EU region) or analyticsplus-support@manageengine.com (Other regions).

Running in distributed mode

Follow the steps below to configure your database on a separate server:
Step 1: Install Analytics Plus on a server (refer to the server specs in the previous section). To move the database to a separate server, stop the Analytics Plus application. Refer to this documentation for details on stopping the application.

Step 2: Copy the <Analytics Plus home>/pgsql folder and paste the contents to a second server. This will now become your database server. The server hosting the Analytics Plus application will be your application server.

Step 3: In the application server, navigate to <Analytics Plus home>/conf folder and head to the app.properties file. It is recommended to save a copy of the file in a separate location before making any changes. Open the file present under the <Analytics Plus home>/conf folder, update the values as shown below for the respective lines and save the file.

  • zrop.non.txn.cp.max.size = 40
  • zdb.querytable.refresh.skipfilewrite = true

Note: Additionally, for Linux installations, navigate to the wrapper_linux.conf file under the <Analytics Plus home>/conf folder, update the values as shown below for the respective lines, and save the file.

  • wrapper.java.initmemory=16384
  • wrapper.java.maxmemory=26624

Step 4: Under the same <Analytics Plus home>/conf folder, navigate to the server.xml file, make the following changes, and save the file.

  • maxThreads="1200"
  • acceptCount="600"

Step 5: Navigate to the customer-config.xml file under the <Analytics Plus home>/conf folder. Add the following line at the end and save the file.

  • <configuration name="StartDBServer" value="false"/>

Step 6: Open the database_params.conf file under the same folder, and search for the following line:

  • url=jdbc:postgresql://localhost:<portnumber>/zreportsdb?OpenSourceSubProtocolOverride=true&charSet=UTF-8

Replace localhost with the hostname or IP address of the database server.

Step 7: Now switch to your database server, and navigate to the <pgsql home>/data folder. Open the pg_hba.conf file, enter the following line and save the file.

  • host all all <IP address>/32 md5

Replace <IP address> with the IP address of your application server. For example, if the IP address of your application server is 192.168.1.1, your entry should read host all all 192.168.1.1/32 md5

Step 8: Navigate to the postgresgl.conf file in the same folder, add the following line at the end of the file and save.

  • listen_addresses = '<IP address>'

Replace <IP address> with the IP address of your application server.

Step 9: Open the postgresql.auto.conf file found in the <pgsql home>/data folder in your database server. Search for the following fields in the file, and update the corresponding values as displayed below.

  • maintenance_work_mem = 2 GB
  • min_wal_size = 4 GB
  • max_parallel_workers_per_gather = 32
  • effective_cache_size = 204800 MB
  • effective_io_concurrency = 0
  • shared_buffers = 32 GB
  • max_wal_size = 20 GB
  • checkpoint_completion_target = 0.9
  • wal_buffers = 32 MB
  • default_statistics_target = 100
  • max_worker_processes = 64
  • max_parallel_workers = 64
  • work_mem = 1 GB

Step 10: Navigate to the <pgsql home>/bin folder in your database server, and execute the following command to start the database.

  • For Windows installations: set "PATH=%PATH%;<pgsql home>\lib" pg_ctl.exe -D ..\data -o"-p <portnumber>" start
  • For Linux installation: ./pg_ctl -D ../data -o"-p <portnumber>" start

The following command can be executed to stop the database:

  • For Windows installations: pg_ctl.exe -D ..\data -o"-p <portnumber>" stop
  • For Linux installations: ./pg_ctl -D ../data -o"-p <portnumber>" stop

Replace <portnumber> with the port number on which the database runs. The default port number is 33366. Please verify this port number in the database_params.conf file present in the <Analytics Plus home>/conf directory of your application server.
Note: Ensure the database port is open for communication between the application server and database server.

Step 11: You have now successfully split your application and database servers. The recommended bandwidth between the two servers is as follows.

 IntervalTransferBandwidth
Application server0.0 to 10.0 seconds1.18 GBytes1.01 GBits/second
Database server0.0 to 10.0 seconds1.18 GBytes1.01 GBits/second

Step 12: Start the Analytics Plus server on the system where it is installed.

  • For Windows installations: Start the Analytics Plus service from your Windows services.
  • For Linux installations: Navigate to the <Analytics Plus home>/bin directory and run the command sh app_ctl.sh run

For detailed steps, refer this documentation. The application will now successfully run with a remote database.

Note: Please write to analyticsplus-support@manageengine.com for recommendations on fine-tuning your application, to maximize your application performance further.

Performance enhancements

The following are some additional things you can do to boost Analytics Plus' performance and improve response time.

Performance fine-tuning

In order to improve application responsiveness, you can increase the heap memory allocated to it. To do this, navigate to the <Analytics Plus home>/conf directory and open the wrapper.conf file (wrapper_linux.conf file, for Linux or Mac installations). Search for the string Maximum Java Heap Size (in MB) and change the value of wrapper.java.maxmemory to 4096. This allocates 4GB of memory to Analytics Plus' processes.

Note: Ensure that you do not allocate all available memory on the server to Analytics Plus. It is recommended to allocate 32 GB, or half of the available memory to the application.

Data import best practices

The following best practices help boost performance when importing data into Analytics Plus.

From files and feeds

When uploading large files (greater than 500 MB), it's best to upload them as CSV files. Smaller files can be uploaded in any of the supported formats without affecting the application's performance.

From local and cloud databases

1. It's always a good idea to have data in a single, unified (de-normalized) table. Generally, data is split into multiple tables in source databases to remove data redundancy and improve integrity. However, this reduces the efficiency while reporting. Use SQL joins to import only the relevant fields split across several tables in your source database.

2. Convert date values that are stored as unix timestamp to human-readable format before importing your data into the application. Performing date format conversions in Analytics Plus reduces the rendering speed, especially when handling large volumes of data.

3. Make static copies of source tables that are frequently updated, before importing them into Analytics Plus. Data mismatch can occur if a table gets updated during the import process. Avoid this by importing from a static copy instead of the live version. This is especially true for cloud databases such as Amazon RDS and other databases that are frequently updated.

4. Import only the relevant data. For instance, if you wish to generate reports for the last three months, import only the data from that period. You can utilize a 'where' clause in SQL queries to customize data import.

5. Deselect unnecessary columns while importing data. This will ensure irrelevant columns are not imported, thereby improving overall application performance.

6. Specify import frequency based on how often you view the reports, and how critical those reports are to your operations. For instance, a dashboard that is accessed at the beginning of every month doesn't need to be updated every hour. Additionally, frequently updating data in workspaces that have several query tables can affect your efficiency, as queries take up quite a bit of resources for their execution. Scheduling data import during non-business hours will enable you to use your resources more efficiently.

7. When scheduling data imports, you may simply wish to import newly added data to the application at regular intervals. In such cases, it is best to select theAdd records at the end option from theHow do you want to import drop-down. This ensures quicker import time, especially when importing large amounts of data.

Note: In order to use the Add records at the end option, the source table should be modified in a way that it only displays new rows of data that have been added since the previous synchronization. Please contact analyticsplus-support@manageengine.com to use this option effectively.

Formulas and query tables

1. Analytics Plus allows you to use powerful formulas to derive KPIs from your raw data. (To learn more about creating formulas, click here.) However, it is advisable to refrain from creating a large number of formulas within the application and instead, create them in the data source itself. This avoids repeated computation of formulas, and increases report loading time in the application.

2. Query tables are useful when combining data from multiple sources or tables. Though Analytics Plus doesn't allow nested queries, it is possible to include the result of one query table in a second query table. As a rule of thumb, limit such operations to three queries. When the nesting goes beyond the third level, loading the reports created from the nested table takes a long time, and in turn affects application performance. Instead, you can create such nested views in the data source, and import it directly into the application.

Reports and Dashboards

The below hacks help enhance performance while creating different types of reports in Analytics Plus.

Charts

While creating reports, it's best to limit the number of points in the x- and y-axes. This improves readability, and ensures faster rendering of the report. You can do this using filters. Click here to learn how to create a filter.

Pivot and Summary views

Refrain from adding too many columns to tabular reports. This reduces the application's performance significantly, since these reports require quite a bit of processing. You can instead create multiple tabular reports with fewer columns, which also improves readability. You can also utilize the compact layout option to collapse multiple columns into a single column. In general, tabular views also perform better when you refrain from using fields that have several unique values, such as the primary key of a table. Lesser number of columns in a pivot view also enables you to view a greater number of rows.

Dashboards

1. KPI widgets are used to display KPIs in a dashboard. Using formula columns as display metrics in your widget can increase widget loading time. Click here to learn more about KPI widgets.

2. User filters allow the viewer to filter the values in a report or dashboard. In scenarios where you know a certain filter value will be used often, you can set it as the default value. Default values limit the volume of data displayed, resulting in quicker loading times. Click here to learn more about user filters.

3. Limit the overall number of charts, KPI widgets and user filters in your dashboard. Adding a high number of views to a dashboard will increase its loading time.

Application maintenance

1. Perform a periodic cleanup of your databases by deleting old data. It is common for unused data to pile up over time. You can create a cleanup cycle based on your reporting needs. For example, if you create reports based on the data from the previous three months only, perform a cleanup every quarter to delete irrelevant data. If your data requires a more complex approach, please write to analyticsplus-support@manageengine.com and we will help you create and maintain a good cleanup cycle for your data.

2. Ensure you delete temporary reports, tables or dashboards that are no longer in use. This will free up the application's resources and boost performance.

3. We release new features and application updates at least twice each quarter. Upgrade your Analytics Plus application regularly to keep up with the latest enhancements and bug fixes. You can check for updates on this page.

Share this post : FacebookTwitter

Can't find answers to your question?