You have a web app, and you care about its performance. How does Tracelytics look at what it's doing and help you optimize, troubleshoot, and save time and money? In this post, we'll review the structure of a simple LAMP stack, the insights you can get out of the box, and the kinds of problems you can diagnose with this approach.
Our Guest Today: LAMP
You may be familiar with this acronym: Linux, Apache, MySQL, and PHP. Of course, Tracelytics works with many different stacks, but LAMP is a very common web application engine. Here’s what the stack looks like, assuming that your database is running on a separate host, either physically or virtually:
Hosts + Layers = Apps
Before we get our hands dirty, let’s take a moment to break down the LAMP stack into various parts.
- Hosts: Web applications are software running on machines, be they physical or virtual. In this example, the Hosts are alice and bob. Hosts form the underpinning of performance with the resources they provide for the app to consume.
- Layers: Each component that is involved in servicing requests for your app is a Layer of the stack. In this example, Apache, PHP, and MySQL are each considered Layers. Typically, you might think of each service that runs in a distinct process as a different Layer.
- Apps: Finally, your entire production LAMP stack is a logical grouping of software components and hosts that work together to serve requests. Let’s call this an App. In the simple LAMP case, everything is considered part of the same App. If you have a separate staging environment, that’s a separate App.
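To make the Hosts + Layers = Apps breakdown concrete, here's a toy model in Python. The class and field names are purely illustrative, not Tracelytics' actual data model:

```python
from dataclasses import dataclass, field

# A toy sketch of the Hosts + Layers = Apps breakdown.
# All names here are hypothetical, for illustration only.

@dataclass(frozen=True)
class Host:
    name: str          # e.g. "alice" or "bob"

@dataclass(frozen=True)
class Layer:
    name: str          # e.g. "Apache", "PHP", "MySQL"
    host: Host         # the machine this layer's process runs on

@dataclass
class App:
    name: str
    layers: list = field(default_factory=list)  # components that cooperate to serve requests

alice, bob = Host("alice"), Host("bob")
production = App("production-lamp", [
    Layer("Apache", alice),
    Layer("PHP", alice),
    Layer("MySQL", bob),
])

# The App spans every host that any of its layers runs on.
hosts = {layer.host.name for layer in production.layers}
```

Note that a separate staging environment would simply be a second `App` instance, possibly sharing none, some, or all of the same hosts.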
How Does Tracelytics Work?
How does Tracelytics gather data about Hosts, Layers, and Apps? Each is handled slightly differently. The following diagram illustrates instrumentation points in green. Explanation follows!
- Hosts: You want to know whether your app is starved for resources or overprovisioned. For this reason, the Tracelyzer agent, installed on each host, gathers metrics about utilization of CPU, memory, I/O, etc. The Tracelyzer is a lightweight daemon, provided as an APT or YUM package, that you can install on every host in your deployment that you want to monitor.
- Layers: Tracelytics presents each Layer slightly differently depending on which features are interesting: for the MySQL Layer, we probably care about queries, tables, and databases, while for Apache, we want to see response codes and HTTP methods. Tracelytics follows requests through your stack, starting at the top layer, which is usually some kind of webserver. To get visibility into the request starting at the web server, you install instrumentation in the form of a plugin module (e.g., an Apache or Nginx module). In the LAMP stack, this instrumentation watches for when requests arrive at Apache from clients, when Apache passes the request along to PHP, when PHP replies with content, and when Apache finally passes that back to the end user.
- What about getting more detail about MySQL and PHP? You can go even deeper if you install instrumentation for the next layer down the stack: PHP. That instrumentation, provided as a PHP extension, listens for incoming requests (e.g., when Apache hands them off), records interesting events such as calls to MySQL and errors, and then notes when PHP has finished processing the request and returned control to Apache. Because the PHP instrumentation sees every query PHP executes, there's no need to run a modified MySQL server.
- Apps: An app is a logical grouping of Layers running on Hosts. It's defined by you, the user, to help you keep track of everything you care about. By default, all new layers are added to an app called "default." User-created apps are defined by their entry point: the point at which requests enter the stack. In our example, that is Apache running on alice. After you have defined the entry points for an app, Tracelytics will follow requests passing through the entry point and automatically populate the remaining hosts and layers in your stack. Because the discovery happens automatically, Tracelytics understands when multiple Apps are sharing Host or Layer resources (e.g., a database). A Tracelytics account can support an unlimited number of apps, which can be useful for segmenting your performance data across different environments.
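One way to picture the data this instrumentation produces: as a request flows through the stack, each layer records entry and exit events with timestamps. The sketch below simulates that flow for our LAMP example; the event labels are hypothetical, not Tracelytics' actual wire format:

```python
import time

# Hypothetical trace: (layer, label, timestamp) events recorded as one
# request flows Apache -> PHP -> MySQL and back.
trace = []

def record(layer, label):
    trace.append((layer, label, time.time()))

record("Apache", "entry")   # request arrives from the client
record("PHP", "entry")      # Apache hands the request to PHP
record("MySQL", "query")    # PHP issues a query; this is seen by the PHP
                            # extension, so MySQL itself needs no changes
record("PHP", "exit")       # PHP returns content to Apache
record("Apache", "exit")    # Apache replies to the end user

layers_seen = [layer for layer, _, _ in trace]
```

Because every event carries a timestamp, the time spent in each layer (and between layers) falls out of simple subtraction, which is what powers the diagnoses below.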
Solving Performance Problems
All of this information comes together in Tracelytics to help you solve web application performance problems. Here’s a few examples of finding problems in a LAMP stack.
- Request Queuing: One common tuning problem is getting the correct number of worker processes/threads in the application layer. When this number is too low, it can cause requests to queue in the webserver, a bottleneck that adds latency to every request. Because Tracelytics starts monitoring requests at the web server, it's easy to catch this problem: if too much time is being spent in Apache before the request is handed off to PHP, this is the likely culprit.
In the picture above, we can see that a significant amount of time was spent in the webserver before it was able to pass the request to a worker thread at the app layer.
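In trace terms, the queuing signal is just the gap between the webserver's entry timestamp and the app layer's entry timestamp. A minimal sketch, with made-up timestamps and a made-up threshold:

```python
# Hypothetical per-request timestamps (in seconds): when Apache received
# the request, and when a PHP worker actually picked it up.
requests = [
    {"apache_entry": 0.000, "php_entry": 0.005},  # healthy: 5 ms in the webserver
    {"apache_entry": 1.000, "php_entry": 1.450},  # queued: 450 ms waiting for a worker
]

QUEUE_THRESHOLD = 0.100  # flag requests that waited > 100 ms for a worker

def queue_time(req):
    # Time spent in Apache before PHP took over -- the request-queuing signal.
    return req["php_entry"] - req["apache_entry"]

queued = [req for req in requests if queue_time(req) > QUEUE_THRESHOLD]
```

If a large fraction of requests land in the `queued` bucket, raising the worker count (or adding capacity) is the usual fix.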
- Bottlenecks: What’s the worst leg of the critical path for a given request? We watch every query made in each trace, which lets Tracelytics show you the low-hanging fruit in terms of slowest queries, frequently-executed queries, and queries that keep users waiting the longest overall. And the same for remote service calls, methods in your app, and more!
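Those three "worst" lists are different aggregations over the same per-trace query timings. A toy sketch of the idea, with invented queries and timings:

```python
from collections import defaultdict

# Hypothetical per-trace query timings: (normalized query, seconds spent).
observed = [
    ("SELECT * FROM users WHERE id = ?", 0.002),
    ("SELECT * FROM users WHERE id = ?", 0.003),
    ("SELECT * FROM orders WHERE user_id = ?", 0.040),
    ("SELECT * FROM users WHERE id = ?", 0.002),
]

stats = defaultdict(lambda: {"count": 0, "total": 0.0, "max": 0.0})
for query, seconds in observed:
    s = stats[query]
    s["count"] += 1
    s["total"] += seconds
    s["max"] = max(s["max"], seconds)

# Three different "worst" lists: slowest single execution, most frequently
# executed, and most total user-facing time.
slowest = max(stats, key=lambda q: stats[q]["max"])
most_frequent = max(stats, key=lambda q: stats[q]["count"])
most_total_time = max(stats, key=lambda q: stats[q]["total"])
```

Note that the three lists often disagree: a fast query executed thousands of times can cost users more time overall than one slow report query.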
- Query Loops: It's generally faster to retrieve a set of rows from a table all at once rather than making a separate query roundtrip for each. However, an unfortunately common design pattern is to fetch a list of items, then execute some other query (e.g., to get an attribute of the item) once for each. This can be hard to catch if you're just looking at slow query logs, because generally each individual query is relatively quick. But they start to add up. Because Tracelytics watches each call a request makes, it's easy to isolate and optimize query loops.
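The contrast is easy to see in miniature. This sketch uses a toy in-memory "database" (all table-like names invented) to count round trips for the loop pattern versus a single batched query, akin to `WHERE id IN (...)`:

```python
# Toy in-memory "database"; item IDs and prices are made up for illustration.
PRICES = {1: 10, 2: 25, 3: 7}
query_count = 0

def fetch_price(item_id):
    global query_count
    query_count += 1          # one round trip per call
    return PRICES[item_id]

def fetch_prices(item_ids):
    global query_count
    query_count += 1          # one round trip total, like WHERE id IN (...)
    return [PRICES[i] for i in item_ids]

items = [1, 2, 3]

# Query-loop pattern: one round trip per item (N queries for N items).
loop_total = sum(fetch_price(i) for i in items)
loop_queries = query_count

query_count = 0
# Batched pattern: one round trip for the whole set.
batch_total = sum(fetch_prices(items))
batch_queries = query_count
```

Both approaches return the same data, but the loop pays per-query latency N times; with real network round trips between alice and bob, that difference dominates quickly.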
- Underprovisioning: Easy juxtaposition of latency graphs and host-level metrics allows you to quickly determine if bad latency is related to lack of machine resources. That’s why we pull in host data along with tracing data:
Look at how the latency rise coincides with increased memory pressure and CPU utilization. Perhaps you've just diagnosed a memory leak?
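The eyeball test above can be backed by a number: correlate the latency series against a host metric over the same window. A minimal sketch with invented samples (Pearson correlation, computed by hand):

```python
# Hypothetical minute-by-minute samples over the same time window:
# request latency (ms) alongside host memory utilization (%).
latency_ms = [120, 125, 130, 210, 340, 510]
memory_pct = [40, 42, 45, 70, 85, 96]

def pearson(xs, ys):
    # Standard Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# A coefficient near 1.0 suggests latency is tracking memory pressure --
# a hint to look for a leak or underprovisioned hosts.
r = pearson(latency_ms, memory_pct)
```

Correlation isn't causation, of course, but a strong signal like this tells you where to dig first.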
Let us know if you have questions or comments about this, or other ideas about ways you could use these tools to diagnose performance issues in your application.